Preface


This edition of Elementary Linear Algebra gives an introductory treatment of linear algebra that is suitable for 
a first undergraduate course. Its aim is to present the fundamentals of linear algebra in the clearest possible 
way—sound pedagogy is the main consideration. Although calculus is not a prerequisite, there is some 
optional material that is clearly marked for students with a calculus background. If desired, that material can 
be omitted without loss of continuity. 


Technology is not required to use this text, but for instructors who would like to use MATLAB, Mathematica, 
Maple, or calculators with linear algebra capabilities, we have posted some supporting material that can be 
accessed at either of the following Web sites: 


www.howardanton.com 


www.wiley.com/college/anton 


Summary of Changes in this Edition 


This edition is a major revision of its predecessor. In addition to including some new material, some of the old 
material has been streamlined to ensure that the major topics can all be covered in a standard course. These 
are the most significant changes: 


Vectors in 2-space, 3-space, and n-space Chapters 3 and 4 of the previous edition have been combined 
into a single chapter. This has enabled us to eliminate some duplicate exposition and to juxtapose concepts 
in n-space with those in 2-space and 3-space, thereby conveying more clearly how n-space ideas generalize 
those already familiar to the student. 


New Pedagogical Elements Each section now ends with a Concept Review and a Skills list that
give the student a convenient reference to the main ideas in that section.


New Exercises Many new exercises have been added, including a set of True/False exercises at the end of 
most sections. 


Earlier Coverage of Eigenvalues and Eigenvectors The chapter on eigenvalues and eigenvectors, which 
was Chapter 7 in the previous edition, is Chapter 5 in this edition. 


Complex Vector Spaces The chapter entitled Complex Vector Spaces in the previous edition has been 
completely revised. The most important ideas are now covered in Section 5.3 and Section 7.5 in the context 
of matrix diagonalization. A brief review of complex numbers is included in the Appendix. 


Quadratic Forms This material has been extensively rewritten to focus more precisely on the most 
important ideas. 


New Chapter on Numerical Methods In the previous edition an assortment of topics appeared in the last 
chapter. That chapter has been replaced by a new chapter that focuses exclusively on numerical methods of 
linear algebra. We achieved this by moving those topics not concerned with numerical methods elsewhere 
in the text. 


Singular-Value Decomposition In recognition of its growing importance, a new section on Singular-Value 
Decomposition has been added to the chapter on numerical methods. 


Internet Search and the Power Method A new section on the Power Method and its application to 
Internet search engines has been added to the chapter on numerical methods. 


Applications There is an expanded version of this text by Howard Anton and Chris Rorres entitled
Elementary Linear Algebra: Applications Version, 10th edition (ISBN 9780470432051), whose purpose is to
supplement this version with an extensive body of applications. However, to accommodate instructors who 
asked us to include some applications in this version of the text, we have done so. These are generally less 
detailed than those appearing in the Anton/Rorres text and can be omitted without loss of continuity. 


Hallmark Features 


Relationships Among Concepts One of our main pedagogical goals is to convey to the student that linear 
algebra is a cohesive subject and not simply a collection of isolated definitions and techniques. One way in 
which we do this is by using a crescendo of Equivalent Statements theorems that continually revisit 
relationships among systems of equations, matrices, determinants, vectors, linear transformations, and 
eigenvalues. To get a general sense of how we use this technique see Theorems 1.5.3, 1.6.4, 2.3.8, 4.8.10, 
4.10.4 and then Theorem 5.1.6, for example. 


Smooth Transition to Abstraction Because the transition from $R^n$ to general vector spaces is difficult for
many students, considerable effort is devoted to explaining the purpose of abstraction and helping the 
student to “visualize” abstract ideas by drawing analogies to familiar geometric ideas. 


Mathematical Precision When reasonable, we try to be mathematically precise. In keeping with the level 
of student audience, proofs are presented in a patient style that is tailored for beginners. There is a brief 
section in the Appendix on how to read proof statements, and there are various exercises in which students 
are guided through the steps of a proof and asked for justification. 


Suitability for a Diverse Audience This text is designed to serve the needs of students in engineering, 
computer science, biology, physics, business, and economics as well as those majoring in mathematics. 


Historical Notes To give the students a sense of mathematical history and to convey that real people 
created the mathematical theorems and equations they are studying, we have included numerous Historical 
Notes that put the topic being studied in historical perspective. 


About the Exercises 


Graded Exercise Sets Each exercise set begins with routine drill problems and progresses to problems 
with more substance. 


True/False Exercises Most exercise sets end with a set of True/False exercises that are designed to check 
conceptual understanding and logical reasoning. To avoid pure guessing, the students are required to justify 
their responses in some way. 

Supplementary Exercise Sets Most chapters end with a set of supplementary exercises that tend to be
more challenging and force the student to draw on ideas from the entire chapter rather than a specific
section.


Supplementary Materials for Students 


* Student Solutions Manual This supplement provides detailed solutions to most theoretical exercises and 
to at least one nonroutine exercise of every type (ISBN 9780470458228). 


* Technology Exercises and Data Files The technology exercises that appeared in the previous edition have 
been moved to the Web site that accompanies this text. Those exercises are designed to be solved using 
MATLAB, Mathematica, or Maple and are accompanied by data files in all three formats. The exercises and 
data can be downloaded from either of the following Web sites. 


www.howardanton.com 


www.wiley.com/college/anton 


Supplementary Materials for Instructors 


* Instructor's Solutions Manual This supplement provides worked-out solutions to most exercises in the 
text (ISBN 9780470458235). 


WileyPLUS™ This is Wiley's proprietary online teaching and learning environment that integrates a 
digital version of this textbook with instructor and student resources to fit a variety of teaching and learning 
styles. WileyPLUS will help your students master concepts in a rich and structured environment that is 
available to them 24/7. It will also help you to personalize and manage your course more effectively with 
student assessments, assignments, grade tracking, and other useful tools. 


* Your students will receive timely access to resources that address their individual needs and will 
receive immediate feedback and remediation resources when needed. 


* There are also self-assessment tools that are linked to the relevant portions of the text that will enable 
your students to take control of their own learning and practice. 


* WileyPLUS will help you to identify those students who are falling behind and to intervene in a 
timely manner without waiting for scheduled office hours. 


More information about WileyPLUS can be obtained from your Wiley representative. 


A Guide for the Instructor 


Although linear algebra courses vary widely in content and philosophy, most courses fall into two categories:
those with about 35-40 lectures and those with about 25-30 lectures. Accordingly, we have created long
and short templates as possible starting points for constructing a course outline. Of course, these are just 
guides, and you will certainly want to customize them to fit your local interests and requirements. Neither of 
these sample templates includes applications. Those can be added, if desired, as time permits. 


                                                       Long Template   Short Template
Chapter 1: Systems of Linear Equations and Matrices    7 lectures      6 lectures
Chapter 2: Determinants                                3 lectures      2 lectures
Chapter 3: Euclidean Vector Spaces                     4 lectures      3 lectures
Chapter 4: General Vector Spaces                       10 lectures     10 lectures
Chapter 5: Eigenvalues and Eigenvectors                3 lectures      3 lectures
Chapter 6: Inner Product Spaces                        3 lectures      1 lecture
Chapter 7: Diagonalization and Quadratic Forms         4 lectures      3 lectures
Chapter 8: Linear Transformations                      3 lectures      2 lectures
Total:                                                 37 lectures     30 lectures
INTRODUCTION


Information in science, business, and mathematics is often organized into rows and
columns to form rectangular arrays called “matrices” (plural of “matrix”). Matrices often
appear as tables of numerical data that arise from physical observations, but they occur in
various mathematical contexts as well. For example, we will see in this chapter that all of
the information required to solve a system of equations such as

$$\begin{aligned} 5x + y &= 3 \\ 2x - y &= 4 \end{aligned}$$

is embodied in the matrix

$$\begin{bmatrix} 5 & 1 & 3 \\ 2 & -1 & 4 \end{bmatrix}$$


and that the solution of the system can be obtained by performing appropriate operations 
on this matrix. This is particularly important in developing computer programs for solving 
systems of equations because computers are well suited for manipulating arrays of 
numerical information. However, matrices are not simply a notational tool for solving 
systems of equations; they can be viewed as mathematical objects in their own right, and 
there is a rich and important theory associated with them that has a multitude of practical 
applications. It is the study of matrices and related topics that forms the mathematical field 
that we call “linear algebra.” In this chapter we will begin our study of matrices.
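As a small illustration of why such arrays suit computers, the following sketch stores the matrix for the system 5x + y = 3, 2x - y = 4 as a nested Python list and reads the equations back from its rows. The names `augmented` and `describe` are illustrative choices, not notation from the text.

```python
# Each row holds [coefficient of x, coefficient of y, right-hand side].
augmented = [
    [5, 1, 3],
    [2, -1, 4],
]

def describe(row, names=("x", "y")):
    """Rebuild a readable equation from one row of the array."""
    terms = [f"{c}*{v}" for c, v in zip(row[:-1], names)]
    return " + ".join(terms) + f" = {row[-1]}"

for row in augmented:
    print(describe(row))
```

Nothing about the unknowns is stored except their order, which is why the position of each number in the array matters.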
1.1 Introduction to Systems of Linear Equations


Systems of linear equations and their solutions constitute one of the major topics that we will study in this 
course. In this first section we will introduce some basic terminology and discuss a method for solving such 
systems. 


Linear Equations 


Recall that in two dimensions a line in a rectangular xy-coordinate system can be represented by an equation of
the form

$$ax + by = c \quad (a, b \text{ not both } 0)$$

and in three dimensions a plane in a rectangular xyz-coordinate system can be represented by an equation of the
form

$$ax + by + cz = d \quad (a, b, c \text{ not all } 0)$$

These are examples of “linear equations,” the first being a linear equation in the variables x and y and the second
a linear equation in the variables x, y, and z. More generally, we define a linear equation in the n variables
$x_1, x_2, \ldots, x_n$ to be one that can be expressed in the form

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \tag{1}$$

where $a_1, a_2, \ldots, a_n$ and $b$ are constants, and the a's are not all zero. In the special cases where $n = 2$ or $n = 3$,
we will often use variables without subscripts and write linear equations as

$$a_1 x + a_2 y = b \quad (a_1, a_2 \text{ not both } 0) \tag{2}$$

$$a_1 x + a_2 y + a_3 z = b \quad (a_1, a_2, a_3 \text{ not all } 0) \tag{3}$$

In the special case where $b = 0$, Equation (1) has the form

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \tag{4}$$

which is called a homogeneous linear equation in the variables $x_1, x_2, \ldots, x_n$.

EXAMPLE 1 Linear Equations

Observe that a linear equation does not involve any products or roots of variables. All variables
occur only to the first power and do not appear, for example, as arguments of trigonometric,
logarithmic, or exponential functions. The following are linear equations:

$$x + 3y = 7 \qquad x_1 - 2x_2 - 3x_3 + x_4 = 0$$
$$\tfrac{1}{2}x - y + 3z = -1 \qquad x_1 + x_2 + \cdots + x_n = 1$$

The following are not linear equations:

$$x + 3y^2 = 4 \qquad 3x + 2y - xy = 5$$
$$\sin x + y = 0 \qquad \sqrt{x_1} + 2x_2 + x_3 = 1$$


A finite set of linear equations is called a system of linear equations or, more briefly, a linear system. The
variables are called unknowns. For example, system (5) that follows has unknowns x and y, and system (6) has
unknowns $x_1$, $x_2$, and $x_3$.

$$\begin{aligned} 5x + y &= 3 \\ 2x - y &= 4 \end{aligned} \tag{5}$$

$$\begin{aligned} 4x_1 - x_2 + 3x_3 &= -1 \\ 3x_1 + x_2 + 9x_3 &= -4 \end{aligned} \tag{6}$$


The double subscripting on the coefficients $a_{ij}$
of the unknowns gives their location in the
system: the first subscript indicates the equation
in which the coefficient occurs, and the second
indicates which unknown it multiplies. Thus, $a_{12}$
is in the first equation and multiplies $x_2$.


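A general linear system of m equations in the n unknowns $x_1, x_2, \ldots, x_n$ can be written as

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned} \tag{7}$$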


A solution of a linear system in n unknowns $x_1, x_2, \ldots, x_n$ is a sequence of n numbers $s_1, s_2, \ldots, s_n$ for which
the substitution

$$x_1 = s_1, \quad x_2 = s_2, \quad \ldots, \quad x_n = s_n$$

makes each equation a true statement. For example, the system in (5) has the solution

$$x = 1, \quad y = -2$$

and the system in (6) has the solution

$$x_1 = 1, \quad x_2 = 2, \quad x_3 = -1$$

These solutions can be written more succinctly as

$$(1, -2) \quad \text{and} \quad (1, 2, -1)$$

in which the names of the variables are omitted. This notation allows us to interpret these solutions geometrically
as points in two-dimensional and three-dimensional space. More generally, a solution

$$x_1 = s_1, \quad x_2 = s_2, \quad \ldots, \quad x_n = s_n$$

of a linear system in n unknowns can be written as

$$(s_1, s_2, \ldots, s_n)$$

which is called an ordered n-tuple. With this notation it is understood that all variables appear in the same order
in each equation. If $n = 2$, then the n-tuple is called an ordered pair, and if $n = 3$, then it is called an ordered
triple.
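The definition of a solution translates directly into a substitution check. The sketch below is one way to write it, assuming each equation is stored as a pair (coefficients, right-hand side); the function name `is_solution` and that storage layout are illustrative assumptions, not notation from the text.

```python
def is_solution(system, candidate):
    """Return True if the tuple `candidate` satisfies every equation.

    Each equation is (coefficients, right_side), with the coefficients
    listed in the same variable order as the candidate tuple.
    """
    return all(
        sum(a * s for a, s in zip(coeffs, candidate)) == rhs
        for coeffs, rhs in system
    )

# Systems (5) and (6) from the text.
system_5 = [([5, 1], 3), ([2, -1], 4)]
system_6 = [([4, -1, 3], -1), ([3, 1, 9], -4)]

print(is_solution(system_5, (1, -2)))     # True
print(is_solution(system_6, (1, 2, -1)))  # True
print(is_solution(system_5, (1, 2)))      # False
```

The check relies on the convention stated above: the entries of the n-tuple must be listed in the same order as the variables in each equation.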


Linear Systems with Two and Three Unknowns 


Linear systems in two unknowns arise in connection with intersections of lines. For example, consider the linear
system

$$\begin{aligned} a_1 x + b_1 y &= c_1 \\ a_2 x + b_2 y &= c_2 \end{aligned}$$

in which the graphs of the equations are lines in the xy-plane. Each solution (x, y) of this system corresponds to a 
point of intersection of the lines, so there are three possibilities (Figure 1.1.1): 
1. The lines may be parallel and distinct, in which case there is no intersection and consequently no solution.
2. The lines may intersect at only one point, in which case the system has exactly one solution.
3. The lines may coincide, in which case there are infinitely many points of intersection (the points on the
common line) and consequently infinitely many solutions.

Figure 1.1.1 (panels: no solution; one solution; infinitely many solutions, coincident lines)
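The three cases can be told apart from the coefficients alone. The following is a minimal sketch, assuming each equation really does describe a line (its two coefficients are not both zero); the function name `classify_2x2` and the sample systems are illustrative and do not come from the text.

```python
def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify a1*x + b1*y = c1, a2*x + b2*y = c2 (each a genuine line)."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        return "one solution"          # the lines cross at a single point
    # det == 0 means the lines are parallel; they coincide exactly when
    # the two equations are proportional, right-hand sides included.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"
    return "no solution"

print(classify_2x2(1, 1, 2, 1, -1, 0))  # x + y = 2, x - y = 0
print(classify_2x2(1, 1, 1, 1, 1, 2))   # x + y = 1, x + y = 2
print(classify_2x2(1, 1, 1, 2, 2, 2))   # x + y = 1, 2x + 2y = 2
```

The quantity `det` is zero precisely when the two lines have the same direction, so the first test separates case 2 from cases 1 and 3.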


In general, we say that a linear system is consistent if it has at least one solution and inconsistent if it has no
solutions. Thus, a consistent linear system of two equations in two unknowns has either one solution or infinitely
many solutions; there are no other possibilities. The same is true for a linear system of three equations in three
unknowns

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= d_1 \\ a_2 x + b_2 y + c_2 z &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned}$$

in which the graphs of the equations are planes. The solutions of the system, if any, correspond to points where
all three planes intersect, so again we see that there are only three possibilities: no solutions, one solution, or
infinitely many solutions (Figure 1.1.2).


Figure 1.1.2 (panels: no solutions, three parallel planes with no common intersection; no solutions, two parallel planes with no common intersection; no solutions, no common intersection; no solutions, two coincident planes parallel to the third with no common intersection; one solution, intersection is a point; infinitely many solutions, intersection is a line; infinitely many solutions, planes are all coincident and the intersection is a plane; infinitely many solutions, two coincident planes and the intersection is a line)


We will prove later that our observations about the number of solutions of linear systems of two equations in two 
unknowns and linear systems of three equations in three unknowns actually hold for all linear systems. That is: 


Every system of linear equations has zero, one, or infinitely many solutions. There are no other 
possibilities. 


EXAMPLE 2 A Linear System with One Solution


Solve the linear system

$$\begin{aligned} x - y &= 1 \\ 2x + y &= 6 \end{aligned}$$

Solution We can eliminate x from the second equation by adding -2 times the first equation to
the second. This yields the simplified system

$$\begin{aligned} x - y &= 1 \\ 3y &= 4 \end{aligned}$$

From the second equation we obtain $y = \tfrac{4}{3}$, and on substituting this value in the first equation we
obtain $x = 1 + y = \tfrac{7}{3}$. Thus, the system has the unique solution

$$x = \tfrac{7}{3}, \quad y = \tfrac{4}{3}$$

Geometrically, this means that the lines represented by the equations in the system intersect at the
single point $\left(\tfrac{7}{3}, \tfrac{4}{3}\right)$. We leave it for you to check this by graphing the lines.
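The elimination in Example 2 can be replayed with exact rational arithmetic. This is a sketch only, using Python's standard `fractions` module; the variable names `eq1` and `eq2` and the list layout are illustrative assumptions.

```python
from fractions import Fraction

# Equations as [coeff of x, coeff of y, right side]: x - y = 1 and 2x + y = 6.
eq1 = [Fraction(1), Fraction(-1), Fraction(1)]
eq2 = [Fraction(2), Fraction(1), Fraction(6)]

# Add -2 times the first equation to the second; this removes x from it.
eq2 = [b + Fraction(-2) * a for a, b in zip(eq1, eq2)]   # now 0x + 3y = 4

y = eq2[2] / eq2[1]                  # solve 3y = 4
x = (eq1[2] - eq1[1] * y) / eq1[0]   # substitute into x - y = 1

print(x, y)  # 7/3 4/3
```

Using `Fraction` rather than floating-point numbers keeps the answer in the same exact form as the hand computation.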


EXAMPLE 3 A Linear System with No Solutions


Solve the linear system

$$\begin{aligned} x + y &= 4 \\ 3x + 3y &= 6 \end{aligned}$$

Solution We can eliminate x from the second equation by adding -3 times the first equation to
the second equation. This yields the simplified system

$$\begin{aligned} x + y &= 4 \\ 0 &= -6 \end{aligned}$$

The second equation is contradictory, so the given system has no solution. Geometrically, this 
means that the lines corresponding to the equations in the original system are parallel and distinct. 
We leave it for you to check this by graphing the lines or by showing that they have the same slope 
but different y-intercepts. 


EXAMPLE 4 A Linear System with Infinitely Many Solutions


Solve the linear system

$$\begin{aligned} 4x - 2y &= 1 \\ 16x - 8y &= 4 \end{aligned}$$

Solution We can eliminate x from the second equation by adding -4 times the first equation to
the second. This yields the simplified system

$$\begin{aligned} 4x - 2y &= 1 \\ 0 &= 0 \end{aligned}$$

The second equation does not impose any restrictions on x and y and hence can be omitted. Thus,
the solutions of the system are those values of x and y that satisfy the single equation

$$4x - 2y = 1 \tag{8}$$

Geometrically, this means the lines corresponding to the two equations in the original system
coincide. One way to describe the solution set is to solve this equation for x in terms of y to obtain
$x = \tfrac{1}{4} + \tfrac{1}{2}y$ and then assign an arbitrary value t (called a parameter) to y. This allows us to
express the solution by the pair of equations (called parametric equations)

$$x = \tfrac{1}{4} + \tfrac{1}{2}t, \quad y = t$$

We can obtain specific numerical solutions from these equations by substituting numerical values
for the parameter. For example, $t = 0$ yields the solution $\left(\tfrac{1}{4}, 0\right)$, $t = 1$ yields the solution $\left(\tfrac{3}{4}, 1\right)$,
and $t = -1$ yields the solution $\left(-\tfrac{1}{4}, -1\right)$. You can confirm that these are solutions by
substituting the coordinates into the given equations.

In Example 4 we could have also obtained
parametric equations for the solutions by
solving (8) for y in terms of x, and letting
$x = t$ be the parameter. The resulting
parametric equations would look different
but would define the same solution set.
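The remark that two different parametrizations define the same solution set can be checked on sample points. The sketch below assumes the two parametrizations of 4x - 2y = 1 obtained by solving for x in terms of y (x = 1/4 + t/2, y = t) and for y in terms of x (x = t, y = 2t - 1/2); the function names `by_y` and `by_x` are illustrative.

```python
from fractions import Fraction

def by_y(t):
    """Parametrization with y as the parameter: x = 1/4 + t/2, y = t."""
    t = Fraction(t)
    return (Fraction(1, 4) + t / 2, t)

def by_x(t):
    """Parametrization with x as the parameter: x = t, y = 2t - 1/2."""
    t = Fraction(t)
    return (t, 2 * t - Fraction(1, 2))

for t in (0, 1, -1):
    x, y = by_y(t)
    assert 4 * x - 2 * y == 1      # the point satisfies 4x - 2y = 1
    assert 16 * x - 8 * y == 4     # and the second original equation
    assert by_x(x) == (x, y)       # the other parametrization reaches it too
```

A finite sample is not a proof, but it shows the idea: every point produced by one description is produced by the other for a suitable parameter value.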


EXAMPLE 5 A Linear System with Infinitely Many Solutions


Solve the linear system

$$\begin{aligned} x - y + 2z &= 5 \\ 2x - 2y + 4z &= 10 \\ 3x - 3y + 6z &= 15 \end{aligned}$$

Solution This system can be solved by inspection, since the second and third equations are
multiples of the first. Geometrically, this means that the three planes coincide and that those values
of x, y, and z that satisfy the equation

$$x - y + 2z = 5 \tag{9}$$

automatically satisfy all three equations. Thus, it suffices to find the solutions of (9). We can do this
by first solving (9) for x in terms of y and z, then assigning arbitrary values r and s (parameters) to
these two variables, and then expressing the solution by the three parametric equations

$$x = 5 + r - 2s, \quad y = r, \quad z = s$$

Specific solutions can be obtained by choosing numerical values for the parameters r and s. For
example, taking $r = 1$ and $s = 0$ yields the solution (6, 1, 0).
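With two parameters the same kind of check runs over a small grid of values. This sketch assumes the parametric equations x = 5 + r - 2s, y = r, z = s for the system of Example 5; the function name `point` and the grid of test values are illustrative.

```python
def point(r, s):
    """Solution of x - y + 2z = 5 with free values y = r and z = s."""
    return (5 + r - 2 * s, r, s)

# The three equations of the system as (coefficients, right side).
equations = [([1, -1, 2], 5), ([2, -2, 4], 10), ([3, -3, 6], 15)]

for r in range(-2, 3):
    for s in range(-2, 3):
        p = point(r, s)
        assert all(sum(a * v for a, v in zip(coeffs, p)) == rhs
                   for coeffs, rhs in equations)

print(point(1, 0))  # (6, 1, 0)
```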


Augmented Matrices and Elementary Row Operations 


As the number of equations and unknowns in a linear system increases, so does the complexity of the algebra
involved in finding solutions. The required computations can be made more manageable by simplifying notation
and standardizing procedures. For example, by mentally keeping track of the location of the +'s, the x's, and the
='s in the linear system

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

we can abbreviate the system by writing only the rectangular array of numbers

$$\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{bmatrix}$$

As noted in the introduction to this chapter, the 
term “matrix” is used in mathematics to denote a 
rectangular array of numbers. In a later section 
we will study matrices in detail, but for now we 
will only be concerned with augmented matrices 
for linear systems. 


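This is called the augmented matrix for the system. For example, the augmented matrix for the system of
equations

$$\begin{aligned} x_1 + x_2 + 2x_3 &= 9 \\ 2x_1 + 4x_2 - 3x_3 &= 1 \\ 3x_1 + 6x_2 - 5x_3 &= 0 \end{aligned}
\qquad \text{is} \qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{bmatrix}$$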
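Forming an augmented matrix amounts to appending each right-hand side to its row of coefficients. A minimal sketch follows, assuming the coefficients are given row by row; the function name `augmented_matrix` and the variable names `A` and `b` are illustrative choices.

```python
def augmented_matrix(coefficients, constants):
    """Append each right-hand side to its row of coefficients."""
    if len(coefficients) != len(constants):
        raise ValueError("need one constant per equation")
    return [list(row) + [b] for row, b in zip(coefficients, constants)]

# The system x1 + x2 + 2x3 = 9, 2x1 + 4x2 - 3x3 = 1, 3x1 + 6x2 - 5x3 = 0.
A = [[1, 1, 2], [2, 4, -3], [3, 6, -5]]
b = [9, 1, 0]

for row in augmented_matrix(A, b):
    print(row)
```

An unknown that is absent from an equation must still be given a coefficient of 0 in that row, so that every column keeps referring to the same unknown.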


The basic method for solving a linear system is to perform appropriate algebraic operations on the system that do 
not alter the solution set and that produce a succession of increasingly simpler systems, until a point is reached 
where it can be ascertained whether the system is consistent, and if so, what its solutions are. Typically, the 
algebraic operations are as follows: 


1. Multiply an equation through by a nonzero constant. 
2. Interchange two equations. 
3. Add a constant times one equation to another. 


Since the rows (horizontal lines) of an augmented matrix correspond to the equations in the associated system, 
these three operations correspond to the following operations on the rows of the augmented matrix: 


1. Multiply a row through by a nonzero constant. 
2. Interchange two rows. 
3. Add a constant times one row to another. 


These are called elementary row operations on a matrix. 
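The three operations are short enough to write out directly. The sketch below assumes a matrix stored as a list of rows and counts rows from 0, which differs from the text's numbering from 1; the function names `scale`, `swap`, and `add_multiple` are illustrative.

```python
def scale(m, i, c):
    """Multiply row i through by the nonzero constant c."""
    if c == 0:
        raise ValueError("the constant must be nonzero")
    m[i] = [c * x for x in m[i]]

def swap(m, i, j):
    """Interchange rows i and j."""
    m[i], m[j] = m[j], m[i]

def add_multiple(m, src, dst, c):
    """Add c times row src to row dst; row src itself is unchanged."""
    m[dst] = [d + c * s for s, d in zip(m[src], m[dst])]

# Augmented matrix of x + y = 4, 3x + 3y = 6.
m = [[1, 1, 4], [3, 3, 6]]
add_multiple(m, 0, 1, -3)
print(m[1])  # [0, 0, -6], i.e. the contradictory equation 0 = -6
```

Each operation can be undone by another operation of the same kind, which is the reason none of them changes the solution set.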


In the following example we will illustrate how to use elementary row operations and an augmented matrix to 
solve a linear system in three unknowns. Since a systematic procedure for solving linear systems will be 
developed in the next section, do not worry about how the steps in the example were chosen. Your objective here 
should be simply to understand the computations. 


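EXAMPLE 6 Using Elementary Row Operations


In the left column we solve a system of linear equations by operating on the equations in the
system, and in the right column we solve the same system by operating on the rows of the
augmented matrix.

$$\begin{aligned} x + y + 2z &= 9 \\ 2x + 4y - 3z &= 1 \\ 3x + 6y - 5z &= 0 \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 2 & 4 & -3 & 1 \\ 3 & 6 & -5 & 0 \end{bmatrix}$$

Add -2 times the first equation to the second (add -2 times the first row to the second) to obtain

$$\begin{aligned} x + y + 2z &= 9 \\ 2y - 7z &= -17 \\ 3x + 6y - 5z &= 0 \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 2 & -7 & -17 \\ 3 & 6 & -5 & 0 \end{bmatrix}$$

Add -3 times the first equation to the third (add -3 times the first row to the third) to obtain

$$\begin{aligned} x + y + 2z &= 9 \\ 2y - 7z &= -17 \\ 3y - 11z &= -27 \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 2 & -7 & -17 \\ 0 & 3 & -11 & -27 \end{bmatrix}$$

Multiply the second equation by $\tfrac{1}{2}$ (multiply the second row by $\tfrac{1}{2}$) to obtain

$$\begin{aligned} x + y + 2z &= 9 \\ y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\ 3y - 11z &= -27 \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\ 0 & 3 & -11 & -27 \end{bmatrix}$$

Add -3 times the second equation to the third (add -3 times the second row to the third) to obtain

$$\begin{aligned} x + y + 2z &= 9 \\ y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\ -\tfrac{1}{2}z &= -\tfrac{3}{2} \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\ 0 & 0 & -\tfrac{1}{2} & -\tfrac{3}{2} \end{bmatrix}$$

Multiply the third equation by -2 (multiply the third row by -2) to obtain

$$\begin{aligned} x + y + 2z &= 9 \\ y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\ z &= 3 \end{aligned}
\qquad
\begin{bmatrix} 1 & 1 & 2 & 9 \\ 0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\ 0 & 0 & 1 & 3 \end{bmatrix}$$

Add -1 times the second equation to the first (add -1 times the second row to the first) to obtain

$$\begin{aligned} x + \tfrac{11}{2}z &= \tfrac{35}{2} \\ y - \tfrac{7}{2}z &= -\tfrac{17}{2} \\ z &= 3 \end{aligned}
\qquad
\begin{bmatrix} 1 & 0 & \tfrac{11}{2} & \tfrac{35}{2} \\ 0 & 1 & -\tfrac{7}{2} & -\tfrac{17}{2} \\ 0 & 0 & 1 & 3 \end{bmatrix}$$

Add $-\tfrac{11}{2}$ times the third equation to the first and $\tfrac{7}{2}$ times the third equation to the second
(add $-\tfrac{11}{2}$ times the third row to the first and $\tfrac{7}{2}$ times the third row to the second) to obtain

$$\begin{aligned} x &= 1 \\ y &= 2 \\ z &= 3 \end{aligned}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{bmatrix}$$

The solution $x = 1$, $y = 2$, $z = 3$ is now evident.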
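The row operations of Example 6 can be replayed mechanically. The sketch below uses exact fractions from Python's standard library and counts rows from 0; the helper names `add_multiple` and `scale` are illustrative, and the multipliers -11/2 and 7/2 in the last step are the values needed to clear the third column above the leading 1.

```python
from fractions import Fraction as F

# Augmented matrix of x + y + 2z = 9, 2x + 4y - 3z = 1, 3x + 6y - 5z = 0.
m = [[F(1), F(1), F(2), F(9)],
     [F(2), F(4), F(-3), F(1)],
     [F(3), F(6), F(-5), F(0)]]

def add_multiple(src, dst, c):
    """Add c times row src to row dst."""
    m[dst] = [d + c * s for s, d in zip(m[src], m[dst])]

def scale(i, c):
    """Multiply row i by the nonzero constant c."""
    m[i] = [c * x for x in m[i]]

add_multiple(0, 1, F(-2))       # second row += -2 * first row
add_multiple(0, 2, F(-3))       # third row  += -3 * first row
scale(1, F(1, 2))               # second row *= 1/2
add_multiple(1, 2, F(-3))       # third row  += -3 * second row
scale(2, F(-2))                 # third row  *= -2
add_multiple(1, 0, F(-1))       # first row  += -1 * second row
add_multiple(2, 0, F(-11, 2))   # first row  += -11/2 * third row
add_multiple(2, 1, F(7, 2))     # second row += 7/2 * third row

print([[int(x) for x in row] for row in m])
# [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3]]
```

The last column of the final matrix holds the solution x = 1, y = 2, z = 3.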


Historical Note The first known use of augmented matrices appeared between 200 B.C.
and 100 B.C. in a Chinese manuscript entitled Nine Chapters of Mathematical Art. The
coefficients were arranged in columns rather than in rows, as today, but remarkably the
system was solved by performing a succession of operations on the columns. The actual
use of the term augmented matrix appears to have been introduced by the American
mathematician Maxime Bôcher in his book Introduction to Higher Algebra, published in
1907. In addition to being an outstanding research mathematician and an expert in Latin,
chemistry, philosophy, zoology, geography, meteorology, art, and music, Bôcher was an
outstanding expositor of mathematics whose elementary textbooks were greatly
appreciated by students and are still in demand today.

Maxime Bôcher (1867-1918)
[Image: Courtesy of the American Mathematical Society]


Concept Review 


* Linear equation
* Homogeneous linear equation
* System of linear equations
* Solution of a linear system
* Ordered n-tuple
* Consistent linear system
* Inconsistent linear system
* Parameter
* Parametric equations
* Augmented matrix
* Elementary row operations


Skills 

* Determine whether a given equation is linear.
* Determine whether a given n-tuple is a solution of a linear system.
* Find the augmented matrix of a linear system.
* Find the linear system corresponding to a given augmented matrix.
* Perform elementary row operations on a linear system and on its corresponding augmented matrix.
* Determine whether a linear system is consistent or inconsistent.
* Find the set of solutions to a consistent linear system.


Exercise Set 1.1 


1. In each part, determine whether the equation is linear in $x_1$, $x_2$, and $x_3$.
(a) $x_1 + 5x_2 - \sqrt{2}\,x_3 = 1$
(b) $x_1 + 3x_2 + x_1 x_3 = 2$
(c) $x_1 = -7x_2 + 3x_3$
(d) $x_1^{-2} + x_2 + 8x_3 = 5$
(e) $x_1^{3/5} - 2x_2 + x_3 = 4$
(f) $\pi x_1 - \sqrt{2}\,x_2 + \tfrac{1}{3}x_3 = 7^{1/3}$


Answer: 


(a), (c), and (f) are linear equations; (b), (d) and (e) are not linear equations 


2. In each part, determine whether the equations form a linear system.
(a) $-2x + 4y + z = 2$
    $3x - \tfrac{2}{y} = 0$
(b) $x = 4$
    $2x = 8$
(c) $4x - y + 2z = -1$
    $-x + (\ln 2)y - 3z = 0$
(d) $3z + x = -4$
    $y + 5z = 1$
    $6x + 2z = 3$
    $-x - y - z = 4$
3. In each part, determine whether the equations form a linear system.
(a) $2x_1 - x_4 = 5$
    $-x_1 + 5x_2 + 3x_3 - 2x_4 = -1$
(b) $\sin(2x_1 + x_3) = \sqrt{5}$
    $e^{-2x_2 - 2x_4} = \tfrac{1}{x_2}$
    $4x_4 = 4$
(c) $7x_1 - x_2 + 2x_3 = 0$
    $2x_1 + x_2 - x_3 x_4 = 3$
    $-x_1 + 5x_3 - x_4 = -1$
(d) $x_1 + x_2 = x_3 + x_4$


Answer: 


(a) and (d) are linear systems; (b) and (c) are not linear systems 
4. For each system in Exercise 2 that is linear, determine whether it is consistent. 


5. For each system in Exercise 3 that is linear, determine whether it is consistent. 
Answer: 


(a) and (d) are both consistent 

6. Write a system of linear equations consisting of three equations in three unknowns with 
(a) no solutions. 
(b) exactly one solution. 


(c) infinitely many solutions. 


7. In each part, determine whether the given vector is a solution of the linear system

$$\begin{aligned} 2x_1 - 4x_2 - x_3 &= 1 \\ x_1 - 3x_2 + x_3 &= 1 \\ 3x_1 - 5x_2 - 3x_3 &= 1 \end{aligned}$$

(a) $(3, 1, 1)$
(b) $(3, -1, 1)$
(c) $(13, 5, 2)$
(d) $\left(\tfrac{13}{2}, \tfrac{5}{2}, 2\right)$
(e) $(17, 7, 5)$
Answer: 


(a), (d), and (e) are solutions; (b) and (c) are not solutions 
8. In each part, determine whether the given vector is a solution of the linear system

$$\begin{aligned} x_1 + 2x_2 - 2x_3 &= 3 \\ 3x_1 - x_2 + x_3 &= 1 \\ -x_1 + 5x_2 - 5x_3 &= 5 \end{aligned}$$


(а) (>. 3, 1) 
(b) a 8, o 
(с) (5. 8, 1) 
ДЕЕ 
III 


9. In each part, find the solution set of the linear equation by using parameters as necessary.
(a) $7x - 5y = 5$
(b) $-8x_1 + 2x_2 - 5x_3 + 6x_4 = 1$


Answer: 


(a) zc ЕН 
x atta 


у = ё 
(b) $x_1 = \tfrac{1}{4}r - \tfrac{5}{8}s + \tfrac{3}{4}t - \tfrac{1}{8}$
$x_2 = r$
$x_3 = s$
$x_4 = t$
10. In each part, find the solution set of the linear equation by using parameters as necessary.
(a) $3x_1 - 5x_2 + 4x_3 = 7$
(b) $3v - 8w + 2x - y + 4z = 0$
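11. In each part, find a system of linear equations corresponding to the given augmented matrix.
(a) $\begin{bmatrix} 2 & 0 & 0 \\ 3 & -4 & 0 \\ 0 & 1 & 1 \end{bmatrix}$
(b) $\begin{bmatrix} 3 & 0 & -2 & 5 \\ 7 & 1 & 4 & -3 \\ 0 & -2 & 1 & 7 \end{bmatrix}$
(c) $\begin{bmatrix} 7 & 2 & 1 & -3 & 5 \\ 1 & 2 & 4 & 0 & 1 \end{bmatrix}$
(d) $\begin{bmatrix} 1 & 0 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix}$
Answer:
(a) $2x_1 = 0$
    $3x_1 - 4x_2 = 0$
    $x_2 = 1$
(b) $3x_1 - 2x_3 = 5$
    $7x_1 + x_2 + 4x_3 = -3$
    $-2x_2 + x_3 = 7$
(c) $7x_1 + 2x_2 + x_3 - 3x_4 = 5$
    $x_1 + 2x_2 + 4x_3 = 1$
(d) $x_1 = 7$
    $x_2 = -2$
    $x_3 = 3$
    $x_4 = 4$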

12. In each part, find a system of linear equations corresponding to the given augmented matrix.


(a) $\begin{bmatrix} 2 & -1 \\ -4 & -6 \\ 1 & -1 \\ 3 & 0 \end{bmatrix}$


GTa Xe 7m. a 


Te e m 
S aec МА i 

eB бф ой 3 

(| 301-4 3 
ed Dd. 4 3 
«cs ne. ud 
И D 


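13. In each part, find the augmented matrix for the given system of linear equations.
(a) $-2x_1 = 6$
    $3x_1 = 8$
    $9x_1 = -3$
(b) $6x_1 - x_2 + 3x_3 = 4$
    $5x_2 - x_3 = 1$
(c) $2x_2 - 3x_4 + x_5 = 0$
    $-3x_1 - x_2 + x_3 = -1$
    $6x_1 + 2x_2 - x_3 + 2x_4 - 3x_5 = 6$
(d) $x_1 - x_5 = 7$


Answer:

(a) $\begin{bmatrix} -2 & 6 \\ 3 & 8 \\ 9 & -3 \end{bmatrix}$
(b) $\begin{bmatrix} 6 & -1 & 3 & 4 \\ 0 & 5 & -1 & 1 \end{bmatrix}$
(c) $\begin{bmatrix} 0 & 2 & 0 & -3 & 1 & 0 \\ -3 & -1 & 1 & 0 & 0 & -1 \\ 6 & 2 & -1 & 2 & -3 & 6 \end{bmatrix}$
(d) $\begin{bmatrix} 1 & 0 & 0 & 0 & -1 & 7 \end{bmatrix}$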


14. In each part, find the augmented matrix for the given system of linear equations.
(a) $3x_1 - 2x_2 = -1$
    $4x_1 + 5x_2 = 3$
    $7x_1 + 3x_2 = 2$
(b) $2x_1 + 2x_3 = 1$
    $3x_1 - x_2 + 4x_3 = 7$
    $6x_1 + x_2 - x_3 = 0$
(c) $x_1 + 2x_2 - x_4 + x_5 = 1$
    $3x_2 + x_3 - x_5 = 2$
    $x_3 + 7x_4 = 1$
(d) $x_1 = 1$
    $x_2 = 2$
    $x_3 = 3$


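15. The curve $y = ax^2 + bx + c$ shown in the accompanying figure passes through the points
$(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$. Show that the coefficients a, b, and c are a solution of the system of
linear equations whose augmented matrix is

$$\begin{bmatrix} x_1^2 & x_1 & 1 & y_1 \\ x_2^2 & x_2 & 1 & y_2 \\ x_3^2 & x_3 & 1 & y_3 \end{bmatrix}$$

Figure Ex-15 (the curve $y = ax^2 + bx + c$ passing through the three points)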


16. Explain why each of the three elementary row operations does not affect the solution set of a linear system. 
17. Show that if the linear equations

$$x_1 + kx_2 = c \quad \text{and} \quad x_1 + lx_2 = d$$

have the same solution set, then the two equations are identical (i.e., $k = l$ and $c = d$).


True-False Exercises 


In parts (a)-(h) determine whether the statement is true or false, and justify your answer. 
(a) A linear system whose equations are all homogeneous must be consistent. 
Answer: 


True 


(b) Multiplying a linear equation through by zero is an acceptable elementary row operation.
Answer: 


False 
(c) The linear system

$$\begin{aligned} x - y &= 3 \\ 2x - 2y &= k \end{aligned}$$

cannot have a unique solution, regardless of the value of k.
Answer: 


True 


(d) A single linear equation with two or more unknowns must always have infinitely many solutions. 
Answer: 


True 


(e) If the number of equations in a linear system exceeds the number of unknowns, then the system must be 
inconsistent. 


Answer: 


False 


(f) If each equation in a consistent linear system is multiplied through by a constant c, then all solutions to the 
new system can be obtained by multiplying solutions from the original system by c. 


Answer: 


False 


(g) Elementary row operations permit one equation in a linear system to be subtracted from another. 
Answer: 


True 


(h) The linear system with corresponding augmented matrix

$$\begin{bmatrix} 2 & -1 & 4 \\ 0 & 0 & -1 \end{bmatrix}$$

is consistent.


Answer: 


False 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


1.2 Gaussian Elimination 


In this section we will develop a systematic procedure for solving systems of linear equations. The procedure is based on 
the idea of performing certain operations on the rows of the augmented matrix for the system that simplifies it to a form 
from which the solution of the system can be ascertained by inspection. 


Considerations in Solving Linear Systems 


When considering methods for solving systems of linear equations, it is important to distinguish between large systems 
that must be solved by computer and small systems that can be solved by hand. For example, there are many applications 
that lead to linear systems in thousands or even millions of unknowns. Large systems require special techniques to deal 
with issues of memory size, roundoff errors, solution time, and so forth. Such techniques are studied in the field of 
numerical analysis and will only be touched on in this text. However, almost all of the methods that are used for large 
systems are based on the ideas that we will develop in this section. 


Echelon Forms 


In Example 6 of the last section, we solved a linear system in the unknowns x, y, and z by reducing the augmented matrix
to the form

\[ \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 3 \end{bmatrix} \]

from which the solution $x = 1$, $y = 2$, $z = 3$ became evident. This is an example of a matrix that is in reduced row
echelon form. To be of this form, a matrix must have the following properties:


1. If a row does not consist entirely of zeros, then the first nonzero number in the row is a 1. We call this a leading 1.
2. If there are any rows that consist entirely of zeros, then they are grouped together at the bottom of the matrix.


3. In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower row occurs farther to the
right than the leading 1 in the higher row.


4. Each column that contains a leading 1 has zeros everywhere else in that column. 


A matrix that has the first three properties is said to be in row echelon form. (Thus, a matrix in reduced row echelon
form is of necessity in row echelon form, but not conversely.) 
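
The four properties above can be checked mechanically. The following Python sketch is an illustration only and is not part of the original text; the function names are hypothetical, and matrices are assumed to be given as lists of rows with exact (integer or fraction) entries.

```python
def leading_index(row):
    """Return the column index of the first nonzero entry, or None for a zero row."""
    for j, value in enumerate(row):
        if value != 0:
            return j
    return None


def is_row_echelon(matrix):
    """Check properties 1-3: leading 1's, zero rows at the bottom, staircase pattern."""
    previous_lead = -1
    seen_zero_row = False
    for row in matrix:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:          # a nonzero row below a zero row violates property 2
            return False
        if row[lead] != 1:         # the first nonzero entry must be 1 (property 1)
            return False
        if lead <= previous_lead:  # the leading 1 must move strictly right (property 3)
            return False
        previous_lead = lead
    return True


def is_reduced_row_echelon(matrix):
    """Check property 4 in addition to properties 1-3."""
    if not is_row_echelon(matrix):
        return False
    for i, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            continue
        for k, other in enumerate(matrix):
            if k != i and other[lead] != 0:  # column of a leading 1 must be zero elsewhere
                return False
    return True


# Illustrative inputs (chosen for this sketch)
reduced = [[1, 0, 0, 0, 3], [0, 1, 0, 0, -1], [0, 0, 1, 0, 0], [0, 0, 0, 1, 5]]
echelon_only = [[1, 4, -3, 7], [0, 1, 6, 2], [0, 0, 1, 5]]
neither = [[1, 2, 3], [0, 0, 0], [0, 0, 1]]

print(is_row_echelon(reduced), is_reduced_row_echelon(reduced))
print(is_row_echelon(echelon_only), is_reduced_row_echelon(echelon_only))
print(is_row_echelon(neither), is_reduced_row_echelon(neither))
```

Because the second function calls the first, the sketch mirrors the remark that every reduced row echelon form is also a row echelon form, but not conversely.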


EXAMPLE 1 Row Echelon and Reduced Row Echelon Form


The following matrices are in reduced row echelon form. 


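The following matrices are in row echelon form but not reduced row echelon form.

\[
\begin{bmatrix} 1 & 4 & -3 & 7 \\ 0 & 1 & 6 & 2 \\ 0 & 0 & 1 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 2 & 6 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]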


EXAMPLE 2 More on Row Echelon and Reduced Row Echelon Form


As Example 1 illustrates, a matrix in row echelon form has zeros below each leading 1, whereas a matrix in 
reduced row echelon form has zeros below and above each leading 1. Thus, with any real numbers substituted for 
the *'s, all matrices of the following types are in row echelon form: 


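\[
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix}
0 & 1 & * & * & * & * & * & * & * & * \\
0 & 0 & 0 & 1 & * & * & * & * & * & * \\
0 & 0 & 0 & 0 & 1 & * & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{bmatrix}
\]

All matrices of the following types are in reduced row echelon form:

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 & * \\ 0 & 1 & 0 & * \\ 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix}
0 & 1 & * & 0 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 1 & 0 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 1 & 0 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 1 & * & * & 0 & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{bmatrix}
\]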


If, by a sequence of elementary row operations, the augmented matrix for a system of linear equations is put in reduced 
row echelon form, then the solution set can be obtained either by inspection or by converting certain linear equations to 
parametric form. Here are some examples. 


In Example 3 we could, if desired, express the
solution more succinctly as the 4-tuple (3, -1, 0, 5).


EXAMPLE 3 Unique Solution


Suppose that the augmented matrix for a linear system in the unknowns $x_1$, $x_2$, $x_3$, and $x_4$ has been reduced
by elementary row operations to

\[ \begin{bmatrix} 1 & 0 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix} \]

This matrix is in reduced row echelon form and corresponds to the equations

\[ \begin{aligned} x_1 &= 3 \\ x_2 &= -1 \\ x_3 &= 0 \\ x_4 &= 5 \end{aligned} \]

Thus, the system has a unique solution, namely, $x_1 = 3$, $x_2 = -1$, $x_3 = 0$, $x_4 = 5$.


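EXAMPLE 4 Linear Systems in Three Unknowns


In each part, suppose that the augmented matrix for a linear system in the unknowns x, y, and z has been
reduced by elementary row operations to the given reduced row echelon form. Solve the system.

\[
\text{(a)} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad
\text{(b)} \begin{bmatrix} 1 & 0 & 3 & -1 \\ 0 & 1 & -4 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad
\text{(c)} \begin{bmatrix} 1 & -5 & 1 & 4 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]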


Solution 
(a) The equation that corresponds to the last row of the augmented matrix is
\[ 0x + 0y + 0z = 1 \]
Since this equation is not satisfied by any values of x, y, and z, the system is inconsistent.

(b) The equation that corresponds to the last row of the augmented matrix is
\[ 0x + 0y + 0z = 0 \]
This equation can be omitted since it imposes no restrictions on x, y, and z; hence, the linear system
corresponding to the augmented matrix is
\[ \begin{aligned} x + 3z &= -1 \\ y - 4z &= 2 \end{aligned} \]
Since x and y correspond to the leading 1's in the augmented matrix, we call these the leading
variables. The remaining variables (in this case z) are called free variables. Solving for the leading
variables in terms of the free variables gives
\[ \begin{aligned} x &= -1 - 3z \\ y &= 2 + 4z \end{aligned} \]
From these equations we see that the free variable z can be treated as a parameter and assigned an
arbitrary value, t, which then determines values for x and y. Thus, the solution set can be represented
by the parametric equations
\[ x = -1 - 3t, \quad y = 2 + 4t, \quad z = t \]
By substituting various values for t in these equations we can obtain various solutions of the system.
For example, setting $t = 0$ yields the solution
\[ x = -1, \quad y = 2, \quad z = 0 \]
and setting $t = 1$ yields the solution
\[ x = -4, \quad y = 6, \quad z = 1 \]

(c) As explained in part (b), we can omit the equations corresponding to the zero rows, in which case the
linear system associated with the augmented matrix consists of the single equation

\[ x - 5y + z = 4 \tag{1} \]

from which we see that the solution set is a plane in three-dimensional space. Although (1) is a valid
form of the solution set, there are many applications in which it is preferable to express the solution
set in parametric form. We can convert (1) to parametric form by solving for the leading variable x in
terms of the free variables y and z to obtain

\[ x = 4 + 5y - z \]

From this equation we see that the free variables can be assigned arbitrary values, say $y = s$ and $z = t$,
which then determine the value of x. Thus, the solution set can be expressed parametrically as

\[ x = 4 + 5s - t, \quad y = s, \quad z = t \tag{2} \]
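
The reasoning used in Examples 3 and 4 (look for a row of the form 0 = nonzero, then compare the number of leading 1's with the number of unknowns) can be summarized in a short Python sketch. This is an illustration under stated assumptions, not part of the original text: the function name `classify` is hypothetical, and the input is assumed to be an augmented matrix that is already in reduced row echelon form.

```python
def classify(rref, num_unknowns):
    """Classify the solution set of a system whose augmented matrix is in RREF."""
    leading_ones = 0
    for row in rref:
        coefficients, constant = row[:num_unknowns], row[num_unknowns]
        if all(c == 0 for c in coefficients):
            if constant != 0:
                return "inconsistent"      # row reads 0 = nonzero
        else:
            leading_ones += 1              # each nonzero row carries one leading 1
    if leading_ones == num_unknowns:
        return "unique solution"           # no free variables remain
    return "infinitely many solutions"     # at least one free variable


example_3 = [[1, 0, 0, 0, 3], [0, 1, 0, 0, -1], [0, 0, 1, 0, 0], [0, 0, 0, 1, 5]]
example_4a = [[1, 0, 0, 0], [0, 1, 2, 0], [0, 0, 0, 1]]
example_4b = [[1, 0, 3, -1], [0, 1, -4, 2], [0, 0, 0, 0]]
example_4c = [[1, -5, 1, 4], [0, 0, 0, 0], [0, 0, 0, 0]]

print(classify(example_3, 4))
print(classify(example_4a, 3))
print(classify(example_4b, 3))
print(classify(example_4c, 3))
```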


We will usually denote parameters in a 
general solution by the letters r, s, t,..., but 
any letters that do not conflict with the names 
of the unknowns can be used. For systems 
with more than three unknowns, subscripted 
letters such as $t_1, t_2, t_3, \ldots$ are convenient.


Formulas, such as (2), that express the solution set of a linear system parametrically have some associated terminology.


DEFINITION 1 


If a linear system has infinitely many solutions, then a set of parametric equations from which all solutions can 
be obtained by assigning numerical values to the parameters is called a general solution of the system.


Elimination Methods 


We have just seen how easy it is to solve a system of linear equations once its augmented matrix is in reduced row 
echelon form. Now we will give a step-by-step elimination procedure that can be used to reduce any matrix to reduced 
row echelon form. As we state each step in the procedure, we illustrate the idea by reducing the following matrix to 
reduced row echelon form. 


\[ \begin{bmatrix} 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -10 & 6 & 12 & 28 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix} \]


Step 1. Locate the leftmost column that does not consist entirely of zeros.

\[ \begin{bmatrix} 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -10 & 6 & 12 & 28 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix} \]

The leftmost nonzero column is the first column.


Step 2. Interchange the top row with another row, if necessary, to bring a nonzero entry to the top of the column found in
Step 1.

\[ \begin{bmatrix} 2 & 4 & -10 & 6 & 12 & 28 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix} \]

The first and second rows in the preceding matrix were interchanged.


Step 3. If the entry that is now at the top of the column found in Step 1 is a, multiply the first row by 1/a in order to
introduce a leading 1.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix} \]

The first row of the preceding matrix was multiplied by 1/2.


Step 4. Add suitable multiples of the top row to the rows below so that all entries below the leading 1 become zeros.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix} \]

-2 times the first row of the preceding matrix was added to the third row.


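Step 5. Now cover the top row in the matrix and begin again with Step 1 applied to the submatrix that remains. Continue
in this way until the entire matrix is in row echelon form.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & -2 & 0 & 7 & 12 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix} \]

The leftmost nonzero column in the submatrix (rows 2 and 3) is the third column.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 5 & 0 & -17 & -29 \end{bmatrix} \]

The first row in the submatrix was multiplied by -1/2 to introduce a leading 1.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 \end{bmatrix} \]

-5 times the first row of the submatrix was added to the second row of the submatrix to introduce a zero below the
leading 1.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & \frac{1}{2} & 1 \end{bmatrix} \]

The top row in the submatrix was covered, and we returned again to Step 1. The leftmost nonzero column in the new
submatrix (row 3) is the fifth column.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \]

The first (and only) row in the new submatrix was multiplied by 2 to introduce a leading 1.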


The entire matrix is now in row echelon form. To find the reduced row echelon form we need the following additional 
step. 

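Step 6. Beginning with the last nonzero row and working upward, add suitable multiples of each row to the rows above
to introduce zeros above the leading 1's.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \]

7/2 times the third row of the preceding matrix was added to the second row.

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 0 & 2 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \]

-6 times the third row was added to the first row.

\[ \begin{bmatrix} 1 & 2 & 0 & 3 & 0 & 7 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \]

5 times the second row was added to the first row.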
The last matrix is in reduced row echelon form. 


The procedure (or algorithm) we have just described for reducing a matrix to reduced row echelon form is called Gauss- 
Jordan elimination. This algorithm consists of two parts, a forward phase in which zeros are introduced below the 
leading 1's and then a backward phase in which zeros are introduced above the leading 1's. If only the forward phase is 
used, then the procedure produces a row echelon form only and is called Gaussian elimination. For example, in the 
preceding computations a row echelon form was obtained at the end of Step 5. 
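
For readers who want to experiment, Steps 1 through 6 can be sketched in Python roughly as follows. This is a minimal illustration, not part of the original text: the function name `gauss_jordan` is hypothetical, and exact arithmetic with the standard `fractions` module is an assumption made here so that no roundoff occurs.

```python
from fractions import Fraction


def gauss_jordan(matrix):
    """Return the reduced row echelon form of a matrix, using exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    top = 0                      # index of the first row of the current submatrix
    pivots = []                  # (row, column) of each leading 1

    # Forward phase (Steps 1-5): produces a row echelon form.
    for col in range(cols):
        if top == rows:
            break
        # Steps 1-2: find a row of the submatrix with a nonzero entry in this column.
        swap = next((r for r in range(top, rows) if a[r][col] != 0), None)
        if swap is None:
            continue             # column is zero in the submatrix; move right
        a[top], a[swap] = a[swap], a[top]
        # Step 3: scale the top row of the submatrix to introduce a leading 1.
        lead = a[top][col]
        a[top] = [x / lead for x in a[top]]
        # Step 4: introduce zeros below the leading 1.
        for r in range(top + 1, rows):
            factor = a[r][col]
            a[r] = [x - factor * p for x, p in zip(a[r], a[top])]
        pivots.append((top, col))
        top += 1                 # Step 5: cover the top row and repeat

    # Backward phase (Step 6): introduce zeros above each leading 1.
    for r, col in reversed(pivots):
        for above in range(r):
            factor = a[above][col]
            a[above] = [x - factor * p for x, p in zip(a[above], a[r])]
    return a


# The matrix reduced step by step in this section.
A = [[0, 0, -2, 0, 7, 12],
     [2, 4, -10, 6, 12, 28],
     [2, 4, -5, 6, -5, -1]]

for row in gauss_jordan(A):
    print([str(x) for x in row])
```

Stopping after the forward phase would give Gaussian elimination; the backward loop is what distinguishes Gauss-Jordan elimination in this sketch.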


Carl Friedrich Gauss (1777-1855) 




Wilhelm Jordan (1842—1899) 


Historical Note Although versions of Gaussian elimination were known much earlier, the power of the method 
was not recognized until the great German mathematician Carl Friedrich Gauss used it to compute the orbit of 
the asteroid Ceres from limited data. What happened was this: On January 1, 1801 the Sicilian astronomer 
Giuseppe Piazzi (1746—1826) noticed a dim celestial object that he believed might be a “missing planet.” He 
named the object Ceres and made a limited number of positional observations but then lost the object as it neared 
the Sun. Gauss undertook the problem of computing the orbit from the limited data using least squares and the 
procedure that we now call Gaussian elimination. The work of Gauss caused a sensation when Ceres reappeared 


a year later in the constellation Virgo at almost the precise position that Gauss predicted! The method was further 
popularized by the German engineer Wilhelm Jordan in his handbook on geodesy (the science of measuring 
Earth shapes) entitled Handbuch der Vermessungskunde and published in 1888. 

[Images: Granger Collection (Gauss); Wikipedia (Jordan)]


EXAMPLE 5 Gauss-Jordan Elimination


Solve by Gauss-Jordan elimination.

\[ \begin{aligned}
x_1 + 3x_2 - 2x_3 + 2x_5 &= 0 \\
2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 &= -1 \\
5x_3 + 10x_4 + 15x_6 &= 5 \\
2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 &= 6
\end{aligned} \]


Solution The augmented matrix for the system is 


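\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 5 \\
2 & 6 & 0 & 8 & 4 & 18 & 6
\end{bmatrix} \]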


Adding —2 times the first row to the second and fourth rows gives 


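\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & -1 & -2 & 0 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 5 \\
0 & 0 & 4 & 8 & 0 & 18 & 6
\end{bmatrix} \]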


Multiplying the second row by —1 and then adding —5 times the new second row to the third row and —4 
times the new second row to the fourth row gives 


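\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 6 & 2
\end{bmatrix} \]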


Interchanging the third and fourth rows and then multiplying the third row of the resulting matrix by 1/6
gives the row echelon form

\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \]

This completes the forward phase since there are zeros below the leading 1's.


Adding —3 times the third row to the second row and then adding 2 times the second row of the resulting 
matrix to the first row yields the reduced row echelon form 


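\[ \begin{bmatrix}
1 & 3 & 0 & 4 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \]

This completes the backward phase since there are zeros above the leading 1's.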


The corresponding system of equations is 


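\[ \begin{aligned}
x_1 + 3x_2 + 4x_4 + 2x_5 &= 0 \\
x_3 + 2x_4 &= 0 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \tag{3} \]

Note that in constructing the linear system in
(3) we ignored the row of zeros in the
corresponding augmented matrix. Why is this
justified?

Solving for the leading variables we obtain

\[ \begin{aligned}
x_1 &= -3x_2 - 4x_4 - 2x_5 \\
x_3 &= -2x_4 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \]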


Finally, we express the general solution of the system parametrically by assigning the free variables $x_2$, $x_4$,
and $x_5$ arbitrary values r, s, and t, respectively. This yields

\[ x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \tfrac{1}{3} \]
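
A general solution can always be checked by substituting it back into the original equations. The Python sketch below does this for Example 5 over a small grid of parameter values; it is an illustration rather than part of the original text, and the names `A`, `b`, `general_solution`, and `satisfies` are chosen here for convenience. Exact fractions are used so that the value 1/3 is represented without roundoff.

```python
from fractions import Fraction
from itertools import product

# Coefficient rows and right-hand sides of the system in Example 5.
A = [[1, 3, -2, 0, 2, 0],
     [2, 6, -5, -2, 4, -3],
     [0, 0, 5, 10, 0, 15],
     [2, 6, 0, 8, 4, 18]]
b = [0, -1, 5, 6]


def general_solution(r, s, t):
    """Return (x1, ..., x6) for the parameter values r, s, t."""
    return [-3 * r - 4 * s - 2 * t, r, -2 * s, s, t, Fraction(1, 3)]


def satisfies(x):
    """Check every equation of the system for the candidate solution x."""
    return all(sum(a * xi for a, xi in zip(row, x)) == rhs
               for row, rhs in zip(A, b))


# Try every combination of r, s, t in {-2, -1, 0, 1, 2}.
all_ok = all(satisfies(general_solution(r, s, t))
             for r, s, t in product(range(-2, 3), repeat=3))
print(all_ok)
```

A finite check like this is not a proof, but a single failing triple would reveal an arithmetic slip in the elimination.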


Homogeneous Linear Systems 


A system of linear equations is said to be homogeneous if the constant terms are all zero; that is, the system has the form

\[ \begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0
\end{aligned} \]

Every homogeneous system of linear equations is consistent because all such systems have $x_1 = 0$, $x_2 = 0$, ..., $x_n = 0$ as
a solution. This solution is called the trivial solution; if there are other solutions, they are called nontrivial solutions.


Because a homogeneous linear system always has the trivial solution, there are only two possibilities for its solutions: 
* The system has only the trivial solution. 
* The system has infinitely many solutions in addition to the trivial solution. 
In the special case of a homogeneous linear system of two equations in two unknowns, say 
\[ \begin{aligned}
a_1x + b_1y &= 0 \quad (a_1, b_1 \text{ not both zero}) \\
a_2x + b_2y &= 0 \quad (a_2, b_2 \text{ not both zero})
\end{aligned} \]
the graphs of the equations are lines through the origin, and the trivial solution corresponds to the point of intersection at 
the origin (Figure 1.2.1). 


Figure 1.2.1 (two panels: only the trivial solution; infinitely many solutions)


There is one case in which a homogeneous system is assured of having nontrivial solutions—namely, whenever the 
system involves more unknowns than equations. To see why, consider the following example of four equations in six 
unknowns. 


EXAMPLE 6 A Homogeneous System


Use Gauss-Jordan elimination to solve the homogeneous linear system

\[ \begin{aligned}
x_1 + 3x_2 - 2x_3 + 2x_5 &= 0 \\
2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 &= 0 \\
5x_3 + 10x_4 + 15x_6 &= 0 \\
2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 &= 0
\end{aligned} \tag{4} \]


Solution Observe first that the coefficients of the unknowns in this system are the same as those in 
Example 5; that is, the two systems differ only in the constants on the right side. The augmented matrix for 
the given homogeneous system is 


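\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 & 0 \\
0 & 0 & 5 & 10 & 0 & 15 & 0 \\
2 & 6 & 0 & 8 & 4 & 18 & 0
\end{bmatrix} \tag{5} \]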


which is the same as the augmented matrix for the system in Example 5, except for zeros in the last 
column. Thus, the reduced row echelon form of this matrix will be the same as that of the augmented 
matrix in Example 5, except for the last column. However, a moment's reflection will make it evident that 
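a column of zeros is not changed by an elementary row operation, so the reduced row echelon form of (5) is

\[ \begin{bmatrix}
1 & 3 & 0 & 4 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{6} \]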

The corresponding system of equations is 


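\[ \begin{aligned}
x_1 + 3x_2 + 4x_4 + 2x_5 &= 0 \\
x_3 + 2x_4 &= 0 \\
x_6 &= 0
\end{aligned} \]

Solving for the leading variables we obtain

\[ \begin{aligned}
x_1 &= -3x_2 - 4x_4 - 2x_5 \\
x_3 &= -2x_4 \\
x_6 &= 0
\end{aligned} \tag{7} \]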


If we now assign the free variables $x_2$, $x_4$, and $x_5$ arbitrary values r, s, and t, respectively, then we can
express the solution set parametrically as

\[ x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = 0 \]

Note that the trivial solution results when $r = s = t = 0$.


Free Variable in Homogeneous Linear Systems 


Example 6 illustrates two important points about solving homogeneous linear systems: 


1. Elementary row operations do not alter columns of zeros in a matrix, so the reduced row echelon form of the 
augmented matrix for a homogeneous linear system has a final column of zeros. This implies that the linear system 
corresponding to the reduced row echelon form is homogeneous, just like the original system. 


2. When we constructed the homogeneous linear system corresponding to augmented matrix (6), we ignored the row of
zeros because the corresponding equation
\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 + 0x_6 = 0 \]
does not impose any conditions on the unknowns. Thus, depending on whether or not the reduced row echelon form 
of the augmented matrix for a homogeneous linear system has any rows of zero, the linear system corresponding to 
that reduced row echelon form will either have the same number of equations as the original system or it will have 
fewer. 


Now consider a general homogeneous linear system with n unknowns, and suppose that the reduced row echelon form of 
the augmented matrix has r nonzero rows. Since each nonzero row has a leading 1, and since each leading 1 corresponds 
to a leading variable, the homogeneous system corresponding to the reduced row echelon form of the augmented matrix 
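must have r leading variables and $n - r$ free variables. Thus, this system is of the form

\[ \begin{aligned}
x_{k_1} + \textstyle\sum(\;) &= 0 \\
x_{k_2} + \textstyle\sum(\;) &= 0 \\
&\;\;\vdots \\
x_{k_r} + \textstyle\sum(\;) &= 0
\end{aligned} \tag{8} \]

where in each equation the expression $\sum(\;)$ denotes a sum that involves the free variables, if any [see (7), for example]. In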
summary, we have the following result. 


THEOREM 1.2.1 Free Variable Theorem for Homogeneous Systems 


If a homogeneous linear system has n unknowns, and if the reduced row echelon form of its augmented matrix 
has r nonzero rows, then the system has n - r free variables. 


Note that Theorem 1.2.2 applies only to 
homogeneous systems—a nonhomogeneous system 
with more unknowns than equations need not be 
consistent. However, we will prove later that if a 
nonhomogeneous system with more unknowns than
equations is consistent, then it has infinitely many
solutions.


Theorem 1.2.1 has an important implication for homogeneous linear systems with more unknowns than equations. 
Specifically, if a homogeneous linear system has m equations in n unknowns, and if $m < n$, then it must also be true that
$r < n$ (why?). This being the case, the theorem implies that there is at least one free variable, and this implies in turn that
the system has infinitely many solutions. Thus, we have the following result. 


THEOREM 1.2.2 


A homogeneous linear system with more unknowns than equations has infinitely many solutions. 


In retrospect, we could have anticipated that the homogeneous system in Example 6 would have infinitely many 
solutions since it has four equations in six unknowns. 
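
The count in Theorem 1.2.1 can be read directly off a reduced row echelon form. The following Python sketch is illustrative only and not part of the original text; the function name `free_variables` is hypothetical, and the input is assumed to be an augmented matrix already in reduced row echelon form. Applied to matrix (6) of Example 6, it should report the free variables with indices 2, 4, and 5, in agreement with n - r = 6 - 3 = 3.

```python
def free_variables(rref_augmented):
    """Return the 1-based indices of the free variables of a system in RREF."""
    n = len(rref_augmented[0]) - 1          # number of unknowns
    leading_columns = set()
    for row in rref_augmented:
        for j in range(n):
            if row[j] != 0:                 # first nonzero entry is the leading 1
                leading_columns.add(j)
                break
    return [j + 1 for j in range(n) if j not in leading_columns]


# Reduced row echelon form (6) from Example 6.
matrix_6 = [[1, 3, 0, 4, 2, 0, 0],
            [0, 0, 1, 2, 0, 0, 0],
            [0, 0, 0, 0, 0, 1, 0],
            [0, 0, 0, 0, 0, 0, 0]]

free = free_variables(matrix_6)
n = len(matrix_6[0]) - 1
r = sum(1 for row in matrix_6 if any(v != 0 for v in row))   # nonzero rows
print(free, n - r)
```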


Gaussian Elimination and Back-Substitution 


For small linear systems that are solved by hand (such as most of those in this text), Gauss-Jordan elimination (reduction 
to reduced row echelon form) is a good procedure to use. However, for large linear systems that require a computer 
solution, it is generally more efficient to use Gaussian elimination (reduction to row echelon form) followed by a 
technique known as back-substitution to complete the process of solving the system. The next example illustrates this 
technique. 


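EXAMPLE 7 Example 5 Solved by Back-Substitution


From the computations in Example 5, a row echelon form of the augmented matrix is

\[ \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{3} \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \]

To solve the corresponding system of equations

\[ \begin{aligned}
x_1 + 3x_2 - 2x_3 + 2x_5 &= 0 \\
x_3 + 2x_4 + 3x_6 &= 1 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \]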
we proceed as follows: 
Step 1. Solve the equations for the leading variables.

\[ \begin{aligned}
x_1 &= -3x_2 + 2x_3 - 2x_5 \\
x_3 &= 1 - 2x_4 - 3x_6 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \]


Step 2. Beginning with the bottom equation and working upward, successively substitute each equation 
into all the equations above it. 


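Substituting $x_6 = \tfrac{1}{3}$ into the second equation yields

\[ \begin{aligned}
x_1 &= -3x_2 + 2x_3 - 2x_5 \\
x_3 &= -2x_4 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \]

Substituting $x_3 = -2x_4$ into the first equation yields

\[ \begin{aligned}
x_1 &= -3x_2 - 4x_4 - 2x_5 \\
x_3 &= -2x_4 \\
x_6 &= \tfrac{1}{3}
\end{aligned} \]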


Step 3. Assign arbitrary values to the free variables, if any. 
If we now assign $x_2$, $x_4$, and $x_5$ the arbitrary values r, s, and t, respectively, the general solution is given by
the formulas

\[ x_1 = -3r - 4s - 2t, \quad x_2 = r, \quad x_3 = -2s, \quad x_4 = s, \quad x_5 = t, \quad x_6 = \tfrac{1}{3} \]


This agrees with the solution obtained in Example 5. 


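EXAMPLE 8


Suppose that the matrices below are augmented matrices for linear systems in the unknowns $x_1$, $x_2$, $x_3$, and
$x_4$. These matrices are all in row echelon form but not reduced row echelon form. Discuss the existence
and uniqueness of solutions to the corresponding linear systems.

\[
\text{(a)} \begin{bmatrix} 1 & -3 & 7 & 2 & 5 \\ 0 & 1 & 2 & -4 & 1 \\ 0 & 0 & 1 & 6 & 9 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad
\text{(b)} \begin{bmatrix} 1 & -3 & 7 & 2 & 5 \\ 0 & 1 & 2 & -4 & 1 \\ 0 & 0 & 1 & 6 & 9 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad
\text{(c)} \begin{bmatrix} 1 & -3 & 7 & 2 & 5 \\ 0 & 1 & 2 & -4 & 1 \\ 0 & 0 & 1 & 6 & 9 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}
\]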


Solution 
(a) The last row corresponds to the equation
\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = 1 \]
from which it is evident that the system is inconsistent.
(b) The last row corresponds to the equation
\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = 0 \]
which has no effect on the solution set. In the remaining three equations the variables x1, x2, and x3 
correspond to leading 1's and hence are leading variables. The variable x4 is a free variable. With a 


little algebra, the leading variables can be expressed in terms of the free variable, and the free variable 
can be assigned an arbitrary value. Thus, the system must have infinitely many solutions. 


(c) The last row corresponds to the equation
\[ x_4 = 0 \]
which gives us a numerical value for $x_4$. If we substitute this value into the third equation, namely,
\[ x_3 + 6x_4 = 9 \]
we obtain $x_3 = 9$. You should now be able to see that if we continue this process and substitute the
known values of $x_3$ and $x_4$ into the equation corresponding to the second row, we will obtain a unique
numerical value for $x_2$; and if, finally, we substitute the known values of $x_4$, $x_3$, and $x_2$ into the
equation corresponding to the first row, we will produce a unique numerical value for $x_1$. Thus, the
system has a unique solution.
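
The substitution process described in part (c) can be carried out by a short routine. The Python sketch below is an illustration, not part of the original text: the function name `back_substitute` is hypothetical, and it assumes the special case in which the augmented matrix is in row echelon form with a leading 1 in every row and no free variables, as in part (c).

```python
from fractions import Fraction


def back_substitute(ref):
    """Solve a system whose n x (n + 1) augmented matrix is in row echelon form
    with a leading 1 on each diagonal position (so the solution is unique)."""
    n = len(ref)
    x = [Fraction(0)] * n
    # Work upward from the bottom equation, substituting the known values.
    for i in range(n - 1, -1, -1):
        known = sum(ref[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(ref[i][n]) - known) / ref[i][i]
    return x


# Augmented matrix of Example 8(c).
matrix_c = [[1, -3, 7, 2, 5],
            [0, 1, 2, -4, 1],
            [0, 0, 1, 6, 9],
            [0, 0, 0, 1, 0]]

solution = back_substitute(matrix_c)
print([str(v) for v in solution])
```

Under these assumptions the routine returns the values in the order $x_1, x_2, x_3, x_4$; handling free variables would require the parametric treatment shown in Example 7.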


Some Facts About Echelon Forms 


There are three facts about row echelon forms and reduced row echelon forms that are important to know but we will not 

prove: 

1. Every matrix has a unique reduced row echelon form; that is, regardless of whether you use Gauss-Jordan elimination 
or some other sequence of elementary row operations, the same reduced row echelon form will result in the end.

2. Row echelon forms are not unique; that is, different sequences of elementary row operations can result in different 
row echelon forms. 

3. Although row echelon forms are not unique, all row echelon forms of a matrix A have the same number of zero rows, 
and the leading 1's always occur in the same positions in the row echelon forms of A. Those are called the pivot
positions of A. A column that contains a pivot position is called a pivot column of A. 


EXAMPLE 9 Pivot Positions and Columns


Earlier in this section (immediately after Definition 1) we found a row echelon form of

\[ A = \begin{bmatrix} 0 & 0 & -2 & 0 & 7 & 12 \\ 2 & 4 & -10 & 6 & 12 & 28 \\ 2 & 4 & -5 & 6 & -5 & -1 \end{bmatrix} \]

to be

\[ \begin{bmatrix} 1 & 2 & -5 & 3 & 6 & 14 \\ 0 & 0 & 1 & 0 & -\frac{7}{2} & -6 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \]

The leading 1's occur in positions (row 1, column 1), (row 2, column 3), and (row 3, column 5). These are
the pivot positions. The pivot columns are columns 1, 3, and 5.
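
Reading off pivot positions amounts to locating the first nonzero entry of each nonzero row of a row echelon form. A minimal Python sketch follows; it is illustrative only, the function name `pivot_positions` is hypothetical, and the input is assumed to be in row echelon form already.

```python
from fractions import Fraction


def pivot_positions(ref):
    """Return the 1-based (row, column) position of each leading 1 in a row echelon form."""
    positions = []
    for i, row in enumerate(ref):
        for j, value in enumerate(row):
            if value != 0:
                positions.append((i + 1, j + 1))
                break                        # only the first nonzero entry counts
    return positions


# Row echelon form of A from Example 9.
ref_of_A = [[1, 2, -5, 3, 6, 14],
            [0, 0, 1, 0, Fraction(-7, 2), -6],
            [0, 0, 0, 0, 1, 2]]

positions = pivot_positions(ref_of_A)
pivot_columns = [col for _, col in positions]
print(positions, pivot_columns)
```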


Roundoff Error and Instability 


There is often a gap between mathematical theory and its practical implementation—Gauss-Jordan elimination and 
Gaussian elimination being good examples. The problem is that computers generally approximate numbers, thereby 
introducing roundoff errors, so unless precautions are taken, successive calculations may degrade an answer to a degree 
that makes it useless. Algorithms (procedures) in which this happens are called unstable. There are various techniques 
for minimizing roundoff error and instability. For example, it can be shown that for large linear systems Gauss-Jordan 
elimination involves roughly 50% more operations than Gaussian elimination, so most computer algorithms are based on 
the latter method. Some of these matters will be considered in Chapter 9. 
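
To make the danger concrete, the following Python sketch solves one small system twice in ordinary floating-point arithmetic. It is an illustration and not part of the original text. The system, the function name `solve_2x2`, and the row-interchange safeguard (commonly called partial pivoting, a technique not described in this section) are all assumptions introduced for the demonstration. The system is $10^{-20}x + y = 1$, $x + y = 2$, whose exact solution has x and y both very close to 1.

```python
def solve_2x2(augmented, interchange):
    """Gaussian elimination with back-substitution on a 2 x 3 augmented matrix,
    carried out in floating-point arithmetic."""
    rows = [list(augmented[0]), list(augmented[1])]
    # Optional safeguard: put the entry of larger magnitude at the top of column 1.
    if interchange and abs(rows[1][0]) > abs(rows[0][0]):
        rows[0], rows[1] = rows[1], rows[0]
    # Eliminate the first unknown from the second row.
    m = rows[1][0] / rows[0][0]
    rows[1] = [v - m * p for v, p in zip(rows[1], rows[0])]
    # Back-substitution.
    y = rows[1][2] / rows[1][1]
    x = (rows[0][2] - rows[0][1] * y) / rows[0][0]
    return x, y


system = [[1e-20, 1.0, 1.0],
          [1.0, 1.0, 2.0]]

naive = solve_2x2(system, interchange=False)
safer = solve_2x2(system, interchange=True)
print(naive)
print(safer)
```

Without the interchange, dividing by the tiny entry produces a huge multiplier that swamps the other entries, and the computed x is badly wrong; with the interchange, the same arithmetic gives an answer close to the exact one.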


Concept Review 


Reduced row echelon form 


Row echelon form 


Leading 1 


Leading variables 


Free variables 


General solution to a linear system 


Gaussian elimination 


Gauss-Jordan elimination 


Forward phase 
Backward phase 


Homogeneous linear system 


Trivial solution 


Nontrivial solution 


Free Variable Theorem for Homogeneous Systems


Back-substitution 


Skills 
* Recognize whether a given matrix is in row echelon form, reduced row echelon form, or neither. 


* Construct solutions to linear systems whose corresponding augmented matrices are in row echelon form or
reduced row echelon form.


* Use Gaussian elimination to find the general solution of a linear system. 
* Use Gauss-Jordan elimination in order to find the general solution of a linear system. 


* Analyze homogeneous linear systems using the Free Variable Theorem for Homogeneous Systems. 


Exercise Set 1.2 


1. In each part, determine whether the matrix is in row echelon form, reduced row echelon form, both, or neither. 


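(a) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

(b) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

(c) \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}

(d) \begin{bmatrix} 1 & 0 & 3 & 1 \\ 0 & 1 & 2 & 4 \end{bmatrix}

(e)

(f)

(g) \begin{bmatrix} 1 & -7 & 5 & 5 \\ 0 & 1 & 3 & 2 \end{bmatrix}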

Answer: 




(a) Both 
(b) Both 
(c) Both 
(d) Both 
(е) Both 
(f) Both 
(g) Row echelon 


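2. In each part, determine whether the matrix is in row echelon form, reduced row echelon form, both, or neither.


(a) \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}

(b) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 0 \end{bmatrix}

(c) \begin{bmatrix} 1 & 3 & 4 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}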

(@ |15 = 
0 1 1 
00 0 

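(e) \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}

(f) \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 0 & 7 & 1 & 3 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

(g) \begin{bmatrix} 1 & -2 & 0 & 1 \\ 0 & 0 & 1 & -2 \end{bmatrix}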


3. In each part, suppose that the augmented matrix for a system of linear equations has been reduced by row operations
to the given reduced row echelon form. Solve the system.


(а) 


| 
` 
| 


"bab dus 


— 4 со 
| 
— Or © кє me O кє WA AN 
м UJ) DH 


© w € cO 


© мл uU 


(с) 


Coot оно 
| 


(d) 


oor OFC Оо юе © Оо юс COCO н 
oha оо н м 


о н 


Answer: 


(a) $x_1 = -37$, $x_2 = -8$, $x_3 = 5$

(b) $x_1 = 13t - 10$, $x_2 = 13t - 5$, $x_3 = -t + 2$, $x_4 = t$

(c) $x_1 = -7s + 2t - 11$, $x_2 = s$, $x_3 = -3t - 4$, $x_4 = -3t + 9$, $x_5 = t$
(d) Inconsistent 


4. In each part, suppose that the augmented matrix for a system of linear equations has been reduced by row operations 
to the given reduced row echelon form. Solve the system. 


(a) \begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 7 \end{bmatrix}

i» [1-0 0-7. g 
010 3 2 
Ü-U-T "3525 

(dne 0-0 3-50 
ÜU 0104 7 
0 0015 8 
0 0000 0 

(d) \begin{bmatrix} 1 & -3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}


In Exercises 5—8, solve the linear system by Gauss-Jordan elimination. 


5.
\[ \begin{aligned}
x_1 + x_2 + 2x_3 &= 8 \\
-x_1 - 2x_2 + 3x_3 &= 1 \\
3x_1 - 7x_2 + 4x_3 &= 10
\end{aligned} \]


Answer:


$x_1 = 3$, $x_2 = 1$, $x_3 = 2$


6. 2x, + 2х3 + 2х3 = 0 
=2кү + 5х2 + 2х3 = 

8xy+xg+4x3 = =l 

7.
\[ \begin{aligned}
x - y + 2z - w &= -1 \\
2x + y - 2z - 2w &= -2 \\
-x + 2y - 4z + w &= 1 \\
3x - 3w &= -3
\end{aligned} \]


Answer:


$x = t - 1$, $y = 2s$, $z = s$, $w = t$


8.
\[ \begin{aligned}
-2b + 3c &= 1 \\
3a + 6b - 3c &= -2 \\
6a + 6b + 3c &= 5
\end{aligned} \]


In Exercises 9-12, solve the linear system by Gaussian elimination. 
9. Exercise 5 


Answer: 


$x_1 = 3$, $x_2 = 1$, $x_3 = 2$
10. Exercise 6 


11. Exercise 7 
Answer: 


$x = t - 1$, $y = 2s$, $z = s$, $w = t$


12. Exercise 8 


In Exercises 13—16, determine whether the homogeneous system has nontrivial solutions by inspection (without pencil 
and paper). 


13.
\[ \begin{aligned}
2x_1 - 3x_2 + 4x_3 - x_4 &= 0 \\
7x_1 + x_2 - 8x_3 + 9x_4 &= 0 \\
2x_1 + 8x_2 + x_3 - x_4 &= 0
\end{aligned} \]


Answer: 


Has nontrivial solutions 


14.
\[ \begin{aligned}
x_1 + 3x_2 - x_3 &= 0 \\
x_2 - 8x_3 &= 0 \\
4x_3 &= 0
\end{aligned} \]


15.
\[ \begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= 0 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= 0
\end{aligned} \]


Answer: 


Has nontrivial solutions 
16.
\[ \begin{aligned}
3x_1 - 2x_2 &= 0 \\
6x_1 - 4x_2 &= 0
\end{aligned} \]


In Exercises 17—24, solve the given homogeneous linear system by any method. 


17. 2x1 + x2+3x3 = 0 
x1 + 2x2 = 0 
0 


Answer: 


$x_1 = 0$, $x_2 = 0$, $x_3 = 0$

18.
\[ \begin{aligned}
2x - y - 3z &= 0 \\
-x + 2y - 3z &= 0 \\
x + y + 4z &= 0
\end{aligned} \]


19.
\[ \begin{aligned}
3x_1 + x_2 + x_3 + x_4 &= 0 \\
5x_1 - x_2 + x_3 - x_4 &= 0
\end{aligned} \]

Answer:


$x_1 = -s$, $x_2 = -t - s$, $x_3 = 4s$, $x_4 = t$
20.
\[ \begin{aligned}
v + 3w - 2x &= 0 \\
2u + v - 4w + 3x &= 0 \\
2u + 3v + 2w - x &= 0 \\
-4u - 3v + 5w - 4x &= 0
\end{aligned} \]
21.
\[ \begin{aligned}
2x + 2y + 4z &= 0 \\
w - y - 3z &= 0 \\
2w + 3x + y + z &= 0 \\
-2w + x + 3y - 2z &= 0
\end{aligned} \]


Answer:


$w = t$, $x = -t$, $y = t$, $z = 0$
22. х]+3хә x4 = 0 
ху 4x3 2x3 == 0 
—2x;—2x3—x4 = 0 

2x1 — 4x2 + x3-c x4 = 0 

x = 2x2 = x3 x4 = 0 


23.
\[ \begin{aligned}
2I_1 - I_2 + 3I_3 + 4I_4 &= 9 \\
I_1 - 2I_3 + 7I_4 &= 11 \\
3I_1 - 3I_2 + I_3 + 5I_4 &= 8 \\
2I_1 + I_2 + 4I_3 + 4I_4 &= 10
\end{aligned} \]

Answer:

$I_1 = -1$, $I_2 = 0$, $I_3 = 1$, $I_4 = 2$
24. Z44- Z44+Z5=0 
—Z,- 272++273—374++75=0 
21+ Z3— 223 —Zs-—Ü 


221+ 225 = 23 + 25= 0 


In Exercises 25—28, determine the values of a for which the system has no solutions, exactly one solution, or infinitely 


many solutions. 


25. х+2у— 28 4 
Зх = y+ 2z = 2 
4х + y+ (а? = м} = 4+2 
Answer: 
If $a = 4$, there are infinitely many solutions; if $a = -4$, there are no solutions; if $a \neq \pm 4$, there is exactly one
solution.
26. х + 25у + zZ m2 
2x = 2y + 3z = 1 
х+2у- (a?—3) — a 
27.
\[ \begin{aligned}
x + 2y &= 1 \\
2x + (a^2 - 5)y &= a - 1
\end{aligned} \]

Answer:

If $a = 3$, there are infinitely many solutions; if $a = -3$, there are no solutions; if $a \neq \pm 3$, there is exactly one
solution.
28. x+ y4 zz = =? 
2x + Зу + liz = —16 
x +2y + (а? Fly = 3a 


In Exercises 29—30, solve the following systems, where a, b, and с are constants. 


29.
\[ \begin{aligned}
2x + y &= a \\
3x + 6y &= b
\end{aligned} \]

Answer:

$x = \frac{2a}{3} - \frac{b}{9}$, $y = -\frac{a}{3} + \frac{2b}{9}$
30.
\[ \begin{aligned}
x_1 + x_2 + x_3 &= a \\
2x_1 + 2x_3 &= b \\
3x_2 + 3x_3 &= c
\end{aligned} \]


31. Find two different row echelon forms of 


21] 


This exercise shows that a matrix can have multiple row echelon forms.


Answer: 
PEE 


32. Reduce 


0 —2 —29 
3 4 5 
to reduced row echelon form without introducing fractions at any intermediate stage. 
33. Show that the following nonlinear system has 18 solutions if 0 < a < 20,0 < y € 27, and 0 < y < 2. 
sna+2cosG+3tany = 0 
2sna+5cosG@+3tany = 0 
0 


=sna—S5cosf+itany = 


[Hint: Begin by making the substitutions $x = \sin\alpha$, $y = \cos\beta$, and $z = \tan\gamma$.]


34. Solve the following system of nonlinear equations for the unknown angles a, В, and y, where 0 < a < 27, 
0«& B «2s, and 0 € 4 « s. 


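\[ \begin{aligned}
2\sin\alpha - \cos\beta + 3\tan\gamma &= 3 \\
4\sin\alpha + 2\cos\beta - 2\tan\gamma &= 2 \\
6\sin\alpha - 3\cos\beta + \tan\gamma &= 9
\end{aligned} \]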


35. Solve the following system of nonlinear equations for x, y, and z. 


x? y? + 2? = 


[Hint: Begin by making the substitutions $X = x^2$, $Y = y^2$, $Z = z^2$.]


Answer: 


$x = \pm 1$, $y = \pm\sqrt{3}$, $z = \pm 2$


36. Solve the following system for x, y, and z. 


Lig 4 
zy 2 1 
2,3%. 8 
Xx y т 0 
1,9, 10 

n kx kc: = 5 


37. Find the coefficients a, b, c, and d so that the curve shown in the accompanying figure is the graph of the equation
$y = ax^3 + bx^2 + cx + d$.


Figure Ex-37 


Answer: 


$a = 1$, $b = -6$, $c = 2$, $d = 10$
38. Find the coefficients a, b, c, and d so that the curve shown in the accompanying figure is given by the equation
$ax^2 + ay^2 + bx + cy + d = 0$.


Figure Ех-38 


39. If the linear system 


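\[ \begin{aligned}
a_1x + b_1y + c_1z &= 0 \\
a_2x + b_2y + c_2z &= 0 \\
a_3x + b_3y + c_3z &= 0
\end{aligned} \]

has only the trivial solution, what can be said about the solutions of the following system?

\[ \begin{aligned}
a_1x + b_1y + c_1z &= 3 \\
a_2x + b_2y + c_2z &= 7 \\
a_3x + b_3y + c_3z &= 11
\end{aligned} \]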


Answer: 


The nonhomogeneous system will have exactly one solution. 


40. (a) If A is a 3 × 5 matrix, then what is the maximum possible number of leading 1's in its reduced row echelon form?


(b) If B is a 3 × 6 matrix whose last column has all zeros, then what is the maximum possible number of parameters
in the general solution of the linear system with augmented matrix B?


(c) If C is a 5 × 3 matrix, then what is the minimum possible number of rows of zeros in any row echelon form of
C?


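41. (a) Prove that if $ad - bc \neq 0$, then the reduced row echelon form of

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{is} \quad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

(b) Use the result in part (a) to prove that if $ad - bc \neq 0$, then the linear system

\[ \begin{aligned}
ax + by &= k \\
cx + dy &= l
\end{aligned} \]

has exactly one solution.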
42. Consider the system of equations 
\[ \begin{aligned}
ax + by &= 0 \\
cx + dy &= 0 \\
ex + fy &= 0
\end{aligned} \]

Discuss the relative positions of the lines $ax + by = 0$, $cx + dy = 0$, and $ex + fy = 0$ when (a) the system has
only the trivial solution, and (b) the system has nontrivial solutions.


43. Describe all possible reduced row echelon forms of 


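(a) \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}

(b) \begin{bmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & p & q \end{bmatrix}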


True-False Exercises 


In parts (a)-(i) determine whether the statement is true or false, and justify your answer.
(a) If a matrix is in reduced row echelon form, then it is also in row echelon form. 
Answer: 


True 


(b) If an elementary row operation is applied to a matrix that is in row echelon form, the resulting matrix will still be in 
row echelon form. 


Answer: 


False 


(c) Every matrix has a unique row echelon form. 
Answer: 


False 


(d) A homogeneous linear system in n unknowns whose corresponding augmented matrix has a reduced row echelon 
form with r leading 1's has n — r free variables. 


Answer: 


True 


(e) All leading 1's in a matrix in row echelon form must occur in different columns. 
Answer: 
True 

(f) If every column of a matrix in row echelon form has a leading 1 then all entries that are not leading 1's are zero. 
Answer: 


False 


(g) If a homogeneous linear system of n equations in n unknowns has a corresponding augmented matrix with a reduced
row echelon form containing n leading 1's, then the linear system has only the trivial solution.


Answer: 


True 


(h) If the reduced row echelon form of the augmented matrix for a linear system has a row of zeros, then the system must 
have infinitely many solutions. 


Answer: 


False 


(i) If a linear system has more unknowns than equations, then it must have infinitely many solutions. 
Answer: 


False 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


1.3 Matrices and Matrix Operations 


Rectangular arrays of real numbers arise in contexts other than as augmented matrices for linear systems. In this 
section we will begin to study matrices as objects in their own right by defining operations of addition, subtraction, 
and multiplication on them. 


Matrix Notation and Terminology 


In Section 1.2 we used rectangular arrays of numbers, called augmented matrices, to abbreviate systems of linear 
equations. However, rectangular arrays of numbers occur in other contexts as well. For example, the following 
rectangular array with three rows and seven columns might describe the number of hours that a student spent studying 
three subjects during a certain week: 


            Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.  Sun.
Math         2     3      2     4       1     4     2
History      0     3      1     4       3     2     2
Language     4     1      3     1       0     0     2


If we suppress the headings, then we are left with the following rectangular array of numbers with three rows and 
seven columns, called a “matrix”: 


$$\begin{bmatrix} 2 & 3 & 2 & 4 & 1 & 4 & 2 \\ 0 & 3 & 1 & 4 & 3 & 2 & 2 \\ 4 & 1 & 3 & 1 & 0 & 0 & 2 \end{bmatrix}$$


More generally, we make the following definition. 


DEFINITION 1 


A matrix is a rectangular array of numbers. The numbers in the array are called the entries in the matrix. 


A matrix with only one column is called a column
vector or a column matrix, and a matrix with only
one row is called a row vector or a row matrix. In
Example 1, the $2 \times 1$ matrix is a column vector, the
$1 \times 4$ matrix is a row vector, and the $1 \times 1$ matrix
is both a row vector and a column vector.


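EXAMPLE 1 Examples of Matrices


Some examples of matrices are

$$\begin{bmatrix} 1 & 2 \\ 3 & 0 \\ -1 & 4 \end{bmatrix}, \quad \begin{bmatrix} 2 & 1 & 0 & -3 \end{bmatrix}, \quad \begin{bmatrix} e & \pi & -\sqrt{2} \\ 0 & \tfrac{1}{2} & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad [4]$$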


The size of a matrix is described in terms of the number of rows (horizontal lines) and columns (vertical lines) it
contains. For example, the first matrix in Example 1 has three rows and two columns, so its size is 3 by 2 (written
$3 \times 2$). In a size description, the first number always denotes the number of rows, and the second denotes the number
of columns. The remaining matrices in Example 1 have sizes $1 \times 4$, $3 \times 3$, $2 \times 1$, and $1 \times 1$, respectively.


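We will use capital letters to denote matrices and lowercase letters to denote numerical quantities; thus we might write

$$A = \begin{bmatrix} 2 & 1 & 7 \\ 3 & 4 & 2 \end{bmatrix} \quad\text{or}\quad C = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix}$$
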
When discussing matrices, it is common to refer to numerical quantities as scalars. Unless stated otherwise, scalars 
will be real numbers; complex scalars will be considered later in the text. 


Matrix brackets are often omitted from 1 x 1 
matrices, making it impossible to tell, for example, 
whether the symbol 4 denotes the number “four” or 
the matrix [4]. This rarely causes problems because 
it is usually possible to tell which is meant from the 
context. 


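The entry that occurs in row i and column j of a matrix A will be denoted by $a_{ij}$. Thus a general $3 \times 4$ matrix might be
written as

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}$$

and a general $m \times n$ matrix as

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{1}$$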


When a compact notation is desired, the preceding matrix can be written as 

$$[a_{ij}]_{m \times n} \quad\text{or}\quad [a_{ij}]$$

the first notation being used when it is important in the discussion to know the size, and the second being used when
the size need not be emphasized. Usually, we will match the letter denoting a matrix with the letter denoting its
entries; thus, for a matrix B we would generally use $b_{ij}$ for the entry in row i and column j, and for a matrix C we
would use the notation $c_{ij}$.


The entry in row i and column j of a matrix A is also commonly denoted by the symbol $(A)_{ij}$. Thus, for matrix (1)
above, we have

$$(A)_{ij} = a_{ij}$$

and for the matrix

$$A = \begin{bmatrix} 2 & -3 \\ 7 & 0 \end{bmatrix}$$

we have $(A)_{11} = 2$, $(A)_{12} = -3$, $(A)_{21} = 7$, and $(A)_{22} = 0$.

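Row and column vectors are of special importance, and it is common practice to denote them by boldface lowercase
letters rather than capital letters. For such matrices, double subscripting of the entries is unnecessary. Thus a general
$1 \times n$ row vector $\mathbf{a}$ and a general $m \times 1$ column vector $\mathbf{b}$ would be written as

$$\mathbf{a} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$

A matrix A with n rows and n columns is called a square matrix of order n, and the entries $a_{11}, a_{22}, \ldots, a_{nn}$
in (2) are said to be on the main diagonal of A.

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \tag{2}$$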


Operations on Matrices 


So far, we have used matrices to abbreviate the work in solving systems of linear equations. For other applications, 
however, it is desirable to develop an “arithmetic of matrices” in which matrices can be added, subtracted, and 
multiplied in a useful way. The remainder of this section will be devoted to developing this arithmetic. 


DEFINITION 2 


Two matrices are defined to be equal if they have the same size and their corresponding entries are equal. 


The equality of two matrices
$A = [a_{ij}]$ and $B = [b_{ij}]$
of the same size can be expressed either by writing
$(A)_{ij} = (B)_{ij}$
or by writing
$a_{ij} = b_{ij}$
where it is understood that the equalities hold for
all values of i and j.


EXAMPLE 2 Equality of Matrices


Consider the matrices

$$A = \begin{bmatrix} 2 & 1 \\ 3 & x \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 1 \\ 3 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 1 & 0 \\ 3 & 4 & 0 \end{bmatrix}$$

If x = 5, then A = B, but for all other values of x the matrices A and B are not equal, since not all of
their corresponding entries are equal. There is no value of x for which A = C since A and C have
different sizes.


DEFINITION 3 


If A and B are matrices of the same size, then the sum A + B is the matrix obtained by adding the entries of B
to the corresponding entries of A, and the difference A − B is the matrix obtained by subtracting the entries of
B from the corresponding entries of A. Matrices of different sizes cannot be added or subtracted.


In matrix notation, if $A = [a_{ij}]$ and $B = [b_{ij}]$ have the same size, then

$$(A + B)_{ij} = (A)_{ij} + (B)_{ij} = a_{ij} + b_{ij} \quad\text{and}\quad (A - B)_{ij} = (A)_{ij} - (B)_{ij} = a_{ij} - b_{ij}$$


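EXAMPLE 3 Addition and Subtraction


Consider the matrices

$$A = \begin{bmatrix} 2 & 1 & 0 & 3 \\ -1 & 0 & 2 & 4 \\ 4 & -2 & 7 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} -4 & 3 & 5 & 1 \\ 2 & 2 & 0 & -1 \\ 3 & 2 & -4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}$$

Then

$$A + B = \begin{bmatrix} -2 & 4 & 5 & 4 \\ 1 & 2 & 2 & 3 \\ 7 & 0 & 3 & 5 \end{bmatrix} \quad\text{and}\quad A - B = \begin{bmatrix} 6 & -2 & -5 & 2 \\ -3 & -2 & 2 & 5 \\ 1 & -4 & 11 & -5 \end{bmatrix}$$

The expressions A + C, B + C, A − C, and B − C are undefined.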
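
Definition 3 can be sketched in code as an entrywise operation guarded by a size check. The following is a minimal Python sketch, assuming a matrix is stored as a list of rows; the helper names `shape`, `add`, and `sub` are illustrative choices, not notation from the text.

```python
def shape(M):
    """Return (number of rows, number of columns) of a list-of-rows matrix."""
    return len(M), len(M[0])


def add(A, B):
    """Entrywise sum A + B; only defined when A and B have the same size."""
    if shape(A) != shape(B):
        raise ValueError("matrices of different sizes cannot be added")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def sub(A, B):
    """Entrywise difference A - B; only defined when A and B have the same size."""
    if shape(A) != shape(B):
        raise ValueError("matrices of different sizes cannot be subtracted")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# The matrices of Example 3.
A = [[2, 1, 0, 3], [-1, 0, 2, 4], [4, -2, 7, 0]]
B = [[-4, 3, 5, 1], [2, 2, 0, -1], [3, 2, -4, 5]]
C = [[1, 1], [2, 2]]

print(add(A, B))  # [[-2, 4, 5, 4], [1, 2, 2, 3], [7, 0, 3, 5]]
print(sub(A, B))  # [[6, -2, -5, 2], [-3, -2, 2, 5], [1, -4, 11, -5]]
```

Calling `add(A, C)` raises `ValueError`, which mirrors the statement that A + C is undefined.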


DEFINITION 4 


If A is any matrix and c is any scalar, then the product cA is the matrix obtained by multiplying each entry of 
the matrix A by c. The matrix cA is said to be a scalar multiple of A. 


In matrix notation, if $A = [a_{ij}]$, then

$$(cA)_{ij} = c(A)_{ij} = ca_{ij}$$


EXAMPLE 4 Scalar Multiples + 


For the matrices 
we have 
It is common practice to denote (— 1)B by —B. 


Thus far we have defined multiplication of a matrix by a scalar but not the multiplication of two matrices. Since 
matrices are added by adding corresponding entries and subtracted by subtracting corresponding entries, it would 
seem natural to define multiplication of matrices by multiplying corresponding entries. However, it turns out that such 
a definition would not be very useful for most problems. Experience has led mathematicians to the following more
useful definition of matrix multiplication.


DEFINITION 5 


If A is an $m \times r$ matrix and B is an $r \times n$ matrix, then the product AB is the $m \times n$ matrix whose entries are
determined as follows: To find the entry in row i and column j of AB, single out row i from the matrix A and
column j from the matrix B. Multiply the corresponding entries from the row and column together, and then
add up the resulting products.


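EXAMPLE 5 Multiplying Matrices


Consider the matrices

$$A = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix}$$
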
Since A is a 2 x 3 matrix and B is a 3 x 4 matrix, the product AB is a 2 x 4 matrix. To determine, for 
example, the entry in row 2 and column 3 of AB, we single out row 2 from A and column 3 from B. 
Then, as illustrated below, we multiply corresponding entries together and add up these products. 


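$$\begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} \square & \square & \square & \square \\ \square & \square & 26 & \square \end{bmatrix}$$

$$(2 \cdot 4) + (6 \cdot 3) + (0 \cdot 5) = 26$$


The entry in row 1 and column 4 of AB is computed as follows:

$$(1 \cdot 3) + (2 \cdot 1) + (4 \cdot 2) = 13$$

The computations for the remaining entries are

$$\begin{aligned} (1 \cdot 4) + (2 \cdot 0) + (4 \cdot 2) &= 12 \\ (1 \cdot 1) - (2 \cdot 1) + (4 \cdot 7) &= 27 \\ (1 \cdot 4) + (2 \cdot 3) + (4 \cdot 5) &= 30 \\ (2 \cdot 4) + (6 \cdot 0) + (0 \cdot 2) &= 8 \\ (2 \cdot 1) - (6 \cdot 1) + (0 \cdot 7) &= -4 \\ (2 \cdot 3) + (6 \cdot 1) + (0 \cdot 2) &= 12 \end{aligned}$$

$$AB = \begin{bmatrix} 12 & 27 & 30 & 13 \\ 8 & -4 & 26 & 12 \end{bmatrix}$$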
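
The row-column rule of Definition 5 translates almost word for word into code. The following is a minimal Python sketch, assuming matrices are stored as lists of rows; the function name `mat_mul` is an illustrative choice, and the data are the matrices of Example 5.

```python
def mat_mul(A, B):
    """Product AB by the row-column rule of Definition 5.

    A is m x r and B is r x n (lists of rows); the result is m x n.
    """
    m, r, n = len(A), len(B), len(B[0])
    if len(A[0]) != r:
        raise ValueError("columns of A must equal rows of B")
    # Entry (i, j): multiply row i of A by column j of B entry by entry, then add.
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(n)]
            for i in range(m)]


A = [[1, 2, 4],
     [2, 6, 0]]
B = [[4, 1, 4, 3],
     [0, -1, 3, 1],
     [2, 7, 5, 2]]

AB = mat_mul(A, B)
print(AB)  # [[12, 27, 30, 13], [8, -4, 26, 12]]
```

Python indices start at 0, so the entry in row 2 and column 3 of AB is `AB[1][2]`, which equals 26.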


The definition of matrix multiplication requires that the number of columns of the first factor A be the same as the
number of rows of the second factor B in order to form the product AB. If this condition is not satisfied, the product is
undefined. A convenient way to determine whether a product of two matrices is defined is to write down the size of
the first factor and, to the right of it, write down the size of the second factor. If, as in (3), the inside numbers are the
same, then the product is defined. The outside numbers then give the size of the product.


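$$\underset{m \times r}{A} \;\; \underset{r \times n}{B} \;=\; \underset{m \times n}{AB} \tag{3}$$

Here the inside numbers are the two r's, and the outside numbers are m and n.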


Gotthold Eisenstein (1823—1852) 


Historical Note The concept of matrix multiplication is due to the German mathematician Gotthold 
Eisenstein, who introduced the idea around 1844 to simplify the process of making substitutions in linear 
systems. The idea was then expanded on and formalized by Cayley in his Memoir on the Theory of Matrices 
that was published in 1858. Eisenstein was a pupil of Gauss, who ranked him as the equal of Isaac Newton 
and Archimedes. However, Eisenstein, suffering from bad health his entire life, died at age 30, so his potential 
was never realized. 

[Image: wikipedia] 


EXAMPLE 6 Determining Whether a Product Is Defined


Suppose that A, B, and C are matrices with the following sizes:

$$\underset{3 \times 4}{A} \qquad \underset{4 \times 7}{B} \qquad \underset{7 \times 3}{C}$$

Then by (3), AB is defined and is a $3 \times 7$ matrix; BC is defined and is a $4 \times 3$ matrix; and CA is defined
and is a $7 \times 4$ matrix. The products AC, CB, and BA are all undefined.
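
The inside/outside rule of (3) needs only the sizes, not the entries. The following is a minimal Python sketch, assuming a size is given as a `(rows, columns)` pair; the function name `product_size` is hypothetical.

```python
def product_size(size_a, size_b):
    """Return the size of AB, or None when the product is undefined.

    size_a = (m, r) and size_b = (r2, n); AB is defined only if the
    inside numbers r and r2 agree, and then its size is the outside pair (m, n).
    """
    m, r = size_a
    r2, n = size_b
    return (m, n) if r == r2 else None


# Sizes from Example 6.
A, B, C = (3, 4), (4, 7), (7, 3)

print(product_size(A, B))  # (3, 7)
print(product_size(B, C))  # (4, 3)
print(product_size(C, A))  # (7, 4)
print(product_size(A, C))  # None: AC is undefined
```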


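In general, if $A = [a_{ij}]$ is an $m \times r$ matrix and $B = [b_{ij}]$ is an $r \times n$ matrix, then, as illustrated by row i of A and column j of B in (4),

$$AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ir} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mr} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1j} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2j} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ b_{r1} & b_{r2} & \cdots & b_{rj} & \cdots & b_{rn} \end{bmatrix} \tag{4}$$

the entry $(AB)_{ij}$ in row i and column j of AB is given by

$$(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ir}b_{rj} \tag{5}$$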


Partitioned Matrices 


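A matrix can be subdivided or partitioned into smaller matrices by inserting horizontal and vertical rules between
selected rows and columns. For example, the following are three possible partitions of a general $3 \times 4$ matrix A: the
first is a partition of A into four submatrices $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$; the second is a partition of A into its row vectors
$\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$; and the third is a partition of A into its column vectors $\mathbf{c}_1$, $\mathbf{c}_2$, $\mathbf{c}_3$, and $\mathbf{c}_4$:

$$A = \left[\begin{array}{ccc|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$

$$A = \left[\begin{array}{cccc} a_{11} & a_{12} & a_{13} & a_{14} \\ \hline a_{21} & a_{22} & a_{23} & a_{24} \\ \hline a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \end{bmatrix}$$

$$A = \left[\begin{array}{c|c|c|c} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{array}\right] = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \mathbf{c}_3 & \mathbf{c}_4 \end{bmatrix}$$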


Matrix Multiplication by Columns and by Rows 


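Partitioning has many uses, one of which is for finding particular rows or columns of a matrix product AB without
computing the entire product. Specifically, the following formulas, whose proofs are left as exercises, show how
individual column vectors of AB can be obtained by partitioning B into column vectors and how individual row
vectors of AB can be obtained by partitioning A into row vectors.

$$AB = A \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_n \end{bmatrix} \tag{6}$$

(AB computed column by column)

$$AB = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \vdots \\ \mathbf{a}_m \end{bmatrix} B = \begin{bmatrix} \mathbf{a}_1 B \\ \mathbf{a}_2 B \\ \vdots \\ \mathbf{a}_m B \end{bmatrix} \tag{7}$$

(AB computed row by row)

In words, these formulas state that

$$j\text{th column vector of } AB = A\,[\,j\text{th column vector of } B\,] \tag{8}$$

$$i\text{th row vector of } AB = [\,i\text{th row vector of } A\,]\,B \tag{9}$$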


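EXAMPLE 7 Example 5 Revisited


If A and B are the matrices in Example 5, then from (8) the second column vector of AB can be obtained
by the computation

$$\begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 7 \end{bmatrix} = \begin{bmatrix} 27 \\ -4 \end{bmatrix}$$

where the column on the left is the second column of B and the column on the right is the second column of AB,
and from (9) the first row vector of AB can be obtained by the computation

$$\begin{bmatrix} 1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} 12 & 27 & 30 & 13 \end{bmatrix}$$

where the row on the left is the first row of A and the row on the right is the first row of AB.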
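
Formulas (8) and (9) say that a single column or row of AB can be found without forming the whole product. The following is a minimal Python sketch of that idea, assuming matrices are lists of rows; `mat_mul`, `column`, and `row` are illustrative helper names, and indices are zero-based as usual in Python.

```python
def mat_mul(A, B):
    """Row-column product of an m x r matrix A and an r x n matrix B."""
    r = len(B)
    if len(A[0]) != r:
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(len(B[0]))]
            for i in range(len(A))]


def column(M, j):
    """Column j of M as a column matrix (a list of one-entry rows)."""
    return [[r[j]] for r in M]


def row(M, i):
    """Row i of M as a row matrix (a list containing one row)."""
    return [M[i]]


A = [[1, 2, 4], [2, 6, 0]]
B = [[4, 1, 4, 3], [0, -1, 3, 1], [2, 7, 5, 2]]

# Formula (8): second column of AB = A times the second column of B.
second_column = mat_mul(A, column(B, 1))
# Formula (9): first row of AB = (first row of A) times B.
first_row = mat_mul(row(A, 0), B)

print(second_column)  # [[27], [-4]]
print(first_row)      # [[12, 27, 30, 13]]
```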


Matrix Products as Linear Combinations 


We have discussed three methods for computing a matrix product AB—entry by entry, column by column, and row by 
row. The following definition provides yet another way of thinking about matrix multiplication. 


DEFINITION 6 


If $A_1, A_2, \ldots, A_r$ are matrices of the same size, and if $c_1, c_2, \ldots, c_r$ are scalars, then an expression of the
form

$$c_1A_1 + c_2A_2 + \cdots + c_rA_r$$

is called a linear combination of $A_1, A_2, \ldots, A_r$ with coefficients $c_1, c_2, \ldots, c_r$.


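To see how matrix products can be viewed as linear combinations, let A be an $m \times n$ matrix and $\mathbf{x}$ an $n \times 1$ column
vector, say

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \quad\text{and}\quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

Then

$$A\mathbf{x} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \tag{10}$$


This proves the following theorem.

THEOREM 1.3.1


If A is an $m \times n$ matrix, and if $\mathbf{x}$ is an $n \times 1$ column vector, then the product $A\mathbf{x}$ can be expressed as a linear
combination of the column vectors of A in which the coefficients are the entries of $\mathbf{x}$.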


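EXAMPLE 8 Matrix Products as Linear Combinations


The matrix product

$$\begin{bmatrix} -1 & 3 & 2 \\ 1 & 2 & -3 \\ 2 & 1 & -2 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}$$

can be written as the following linear combination of column vectors

$$2 \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} - 1 \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix}$$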
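
Theorem 1.3.1 can be checked numerically by building the product one column at a time. The following is a minimal Python sketch using the data of Example 8, assuming the matrix is a list of rows and the column vector is a flat list; the function names are illustrative.

```python
def linear_combination_of_columns(A, x):
    """Compute x[0]*(column 0 of A) + x[1]*(column 1 of A) + ... as a flat list."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):          # one column of A at a time
        for i in range(m):      # add x_j times that column to the running total
            result[i] += x[j] * A[i][j]
    return result


def row_column_product(A, x):
    """Compute Ax entry by entry with the row-column rule, for comparison."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = [[-1, 3, 2],
     [1, 2, -3],
     [2, 1, -2]]
x = [2, -1, 3]

print(linear_combination_of_columns(A, x))  # [1, -9, -3]
print(row_column_product(A, x))             # [1, -9, -3]
```

Both functions perform the same multiplications and additions; only the order of summation differs, which is why the two results agree.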


EXAMPLE 9 Columns of a Product AB as Linear Combinations 


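We showed in Example 5 that

$$AB = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 6 & 0 \end{bmatrix} \begin{bmatrix} 4 & 1 & 4 & 3 \\ 0 & -1 & 3 & 1 \\ 2 & 7 & 5 & 2 \end{bmatrix} = \begin{bmatrix} 12 & 27 & 30 & 13 \\ 8 & -4 & 26 & 12 \end{bmatrix}$$

It follows from Formula (6) and Theorem 1.3.1 that the jth column vector of AB can be expressed as a
linear combination of the column vectors of A in which the coefficients in the linear combination are the
entries from the jth column of B. The computations are as follows:

$$\begin{aligned} \begin{bmatrix} 12 \\ 8 \end{bmatrix} &= 4 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 0 \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 2 \begin{bmatrix} 4 \\ 0 \end{bmatrix} \\ \begin{bmatrix} 27 \\ -4 \end{bmatrix} &= \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 7 \begin{bmatrix} 4 \\ 0 \end{bmatrix} \\ \begin{bmatrix} 30 \\ 26 \end{bmatrix} &= 4 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 5 \begin{bmatrix} 4 \\ 0 \end{bmatrix} \\ \begin{bmatrix} 13 \\ 12 \end{bmatrix} &= 3 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 6 \end{bmatrix} + 2 \begin{bmatrix} 4 \\ 0 \end{bmatrix} \end{aligned}$$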


Matrix Form of a Linear System 


Matrix multiplication has an important application to systems of linear equations. Consider a system of m linear 
equations in n unknowns: 


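$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$


Since two matrices are equal if and only if their corresponding entries are equal, we can replace the m equations in
this system by the single matrix equation

$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$


The $m \times 1$ matrix on the left side of this equation can be written as a product to give

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$


If we designate these matrices by A, $\mathbf{x}$, and $\mathbf{b}$, respectively, then the original system of m equations in
n unknowns can be replaced by the single matrix equation

$$A\mathbf{x} = \mathbf{b}$$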


The matrix А in this equation is called the coefficient matrix of the system. The augmented matrix for the system is 
obtained by adjoining b to А as the last column; thus the augmented matrix is 


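$$[A \mid \mathbf{b}] = \left[\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right]$$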


The vertical bar in [A|b] is a convenient way to 
separate A from b visually; it has no mathematical 
significance. 


Transpose of a Matrix 


We conclude this section by defining two matrix operations that have no analogs in the arithmetic of real numbers. 


DEFINITION 7 


If A is any $m \times n$ matrix, then the transpose of A, denoted by $A^T$, is defined to be the $n \times m$ matrix that results
by interchanging the rows and columns of A; that is, the first column of $A^T$ is the first row of A, the second
column of $A^T$ is the second row of A, and so forth.


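EXAMPLE 10 Some Transposes


The following are some examples of matrices and their transposes.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 3 \\ 1 & 4 \\ 5 & 6 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 3 & 5 \end{bmatrix}, \quad D = [4]$$

$$A^T = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \\ a_{14} & a_{24} & a_{34} \end{bmatrix}, \quad B^T = \begin{bmatrix} 2 & 1 & 5 \\ 3 & 4 & 6 \end{bmatrix}, \quad C^T = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}, \quad D^T = [4]$$


Observe that not only are the columns of $A^T$ the rows of A, but the rows of $A^T$ are the columns of A. Thus the entry in
row i and column j of $A^T$ is the entry in row j and column i of A; that is,

$$(A^T)_{ij} = (A)_{ji} \tag{11}$$

Note the reversal of the subscripts.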


In the special case where A is a square matrix, the transpose of A can be obtained by interchanging entries that are
symmetrically positioned about the main diagonal. In (12) we see that $A^T$ can also be obtained by "reflecting" A about
its main diagonal.

$$A = \begin{bmatrix} 1 & -2 & 4 \\ 3 & 7 & 0 \\ -5 & 8 & 6 \end{bmatrix} \quad\longrightarrow\quad A^T = \begin{bmatrix} 1 & 3 & -5 \\ -2 & 7 & 8 \\ 4 & 0 & 6 \end{bmatrix} \tag{12}$$

Interchange entries that are symmetrically positioned about the main diagonal.
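
Formula (11) can be sketched in a few lines of code. The following is a minimal Python sketch, assuming matrices are lists of rows; the function name `transpose` is an illustrative choice, and the data are the matrix in (12) and the matrix B of Example 10.

```python
def transpose(A):
    """Return the transpose of A: row i of the result is column i of A."""
    return [list(col) for col in zip(*A)]


A = [[1, -2, 4],
     [3, 7, 0],
     [-5, 8, 6]]
B = [[2, 3],
     [1, 4],
     [5, 6]]

print(transpose(A))  # [[1, 3, -5], [-2, 7, 8], [4, 0, 6]]
print(transpose(B))  # [[2, 1, 5], [3, 4, 6]]

# Formula (11): the (i, j) entry of the transpose is the (j, i) entry of A.
At = transpose(A)
assert all(At[i][j] == A[j][i] for i in range(3) for j in range(3))
```

Transposing twice returns the original matrix, so `transpose(transpose(B)) == B` holds.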
DEFINITION 8 


If A is a square matrix, then the trace of A, denoted by tr(A), is defined to be the sum of the entries on the 
main diagonal of A. The trace of A is undefined if A is not a square matrix. 


James Sylvester (1814—1897) 


Arthur Cayley (1821—1895) 


Historical Note The term matrix was first used by the English mathematician (and lawyer) James Sylvester,
who defined the term in 1850 to be an “oblong arrangement of terms.” Sylvester communicated his work on
matrices to a fellow English mathematician and lawyer named Arthur Cayley, who then introduced some of
the basic operations on matrices in a book entitled Memoir on the Theory of Matrices that was published in
1858. As a matter of interest, Sylvester, who was Jewish, did not get his college degree because he refused to
sign a required oath to the Church of England. He was appointed to a chair at the University of Virginia in the
United States but resigned after swatting a student with a stick because he was reading a newspaper in class.
Sylvester, thinking he had killed the student, fled back to England on the first available ship. Fortunately, the
student was not dead, just in shock!
[Images: The Granger Collection, New York]


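EXAMPLE 11 Trace of a Matrix


The following are examples of matrices and their traces.

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \quad B = \begin{bmatrix} -1 & 2 & 7 & 0 \\ 3 & 5 & -8 & 4 \\ 1 & 2 & 7 & -3 \\ 4 & -2 & 1 & 0 \end{bmatrix}$$

$$\operatorname{tr}(A) = a_{11} + a_{22} + a_{33}, \qquad \operatorname{tr}(B) = -1 + 5 + 7 + 0 = 11$$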


In the exercises you will have some practice working with the transpose and trace operations. 
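
As a small illustration of Definition 8, the trace can be computed by summing the diagonal entries after checking that the matrix is square. The following is a minimal Python sketch, assuming matrices are lists of rows; the function name `trace` is illustrative, and the data reuse the square matrix shown in (12).

```python
def trace(A):
    """Sum of the main-diagonal entries of a square matrix A (list of rows)."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the trace is undefined for a non-square matrix")
    return sum(A[i][i] for i in range(n))


A = [[1, -2, 4],
     [3, 7, 0],
     [-5, 8, 6]]

print(trace(A))  # 1 + 7 + 6 = 14
```

Because transposing a square matrix leaves its main diagonal in place, the transpose of A has the same trace as A.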


Concept Review 
Matrix 
Entries 


Column vector (or column matrix) 


Row vector (or row matrix) 


Square matrix 


Main diagonal 


Equal matrices 


Matrix operations: sum, difference, scalar multiplication 


Linear combination of matrices 


Product of matrices (matrix multiplication) 


Partitioned matrices 


Submatrices 


Row-column method 


Column method 


Row method 


Coefficient matrix of a linear system 


Transpose 


Trace 


Skills 


Determine the size of a given matrix. 


Identify the row vectors and column vectors of a given matrix. 


Perform the arithmetic operations of matrix addition, subtraction, scalar multiplication, and multiplication. 


Determine whether the product of two given matrices is defined. 


Compute matrix products using the row-column method, the column method, and the row method. 


Express the product of a matrix and a column vector as a linear combination of the columns of the matrix. 


Express a linear system as a matrix equation, and identify the coefficient matrix. 


Compute the transpose of a matrix. 


Compute the trace of a square matrix. 


Exercise Set 1.3 


1. Suppose that A, B, C, D, and E are matrices with the following sizes:

$$\underset{(4 \times 5)}{A} \qquad \underset{(4 \times 5)}{B} \qquad \underset{(5 \times 2)}{C} \qquad \underset{(4 \times 2)}{D} \qquad \underset{(5 \times 4)}{E}$$

In each part, determine whether the given matrix expression is defined. For those that are defined, give the size of
the resulting matrix.

(a) $BA$
(b) $AC + D$
(c) $AE + B$
(d) $AB + E$
(e) $E(A + B)$
(f) $E(AC)$
(g) $E^TA$
(h) $(A^T + E)D$

Answer:

(a) Undefined
(b) $4 \times 2$
(c) Undefined
(d) Undefined
(e) $5 \times 5$
(f) $5 \times 2$
(g) Undefined
(h) $5 \times 2$


2. Suppose that A, B, C, D, and E are matrices with the following sizes:

$$\underset{(3 \times 1)}{A} \qquad \underset{(3 \times 6)}{B} \qquad \underset{(6 \times 2)}{C} \qquad \underset{(2 \times 6)}{D} \qquad \underset{(1 \times 3)}{E}$$

In each part, determine whether the given matrix expression is defined. For those that are defined, give the size of
the resulting matrix.

(a) $EA$
(b) $AB^T$
(c) $B^T(A + E^T)$
(d) $2A + C$
(e) $(C^T + D)B^T$
(f) $CD + B^TE^T$
(g) $(BD^T)C^T$
(h) $DC + EA$


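3. Consider the matrices

$$A = \begin{bmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & -1 \\ 0 & 2 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 5 & 2 \\ -1 & 0 & 1 \\ 3 & 2 & 4 \end{bmatrix}, \quad E = \begin{bmatrix} 6 & 1 & 3 \\ -1 & 1 & 2 \\ 4 & 1 & 3 \end{bmatrix}$$

In each part, compute the given expression (where possible).
(a) $D + E$
(b) $D - E$
(c) $5A$
(d) $-7C$
(e) $2B - C$
(f) $4E - 2D$
(g) $-3(D + 2E)$
(h) $A - A$
(i) $\operatorname{tr}(D)$
(j) $\operatorname{tr}(D - 3E)$
(k) $4\operatorname{tr}(7B)$
(l) $\operatorname{tr}(A)$

Answer:

(a) $\begin{bmatrix} 7 & 6 & 5 \\ -2 & 1 & 3 \\ 7 & 3 & 7 \end{bmatrix}$
(b) $\begin{bmatrix} -5 & 4 & -1 \\ 0 & -1 & -1 \\ -1 & 1 & 1 \end{bmatrix}$
(c) $\begin{bmatrix} 15 & 0 \\ -5 & 10 \\ 5 & 5 \end{bmatrix}$
(d) $\begin{bmatrix} -7 & -28 & -14 \\ -21 & -7 & -35 \end{bmatrix}$
(e) Undefined
(f) $\begin{bmatrix} 22 & -6 & 8 \\ -2 & 4 & 6 \\ 10 & 0 & 4 \end{bmatrix}$
(g) $\begin{bmatrix} -39 & -21 & -24 \\ 9 & -6 & -15 \\ -33 & -12 & -30 \end{bmatrix}$
(h) $\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$
(i) 5
(j) −25
(k) 168
(l) Undefined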

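4. Using the matrices in Exercise 3, in each part compute the given expression (where possible).
(a) $2A^T + C$
(b) $D^T - E^T$
(c) $(D - E)^T$
(d) $B^T + 5C^T$
(e) $\tfrac{1}{2}C^T - \tfrac{1}{4}A$
(f) $B - B^T$
(g) $2E^T - 3D^T$
(h) $(2E^T - 3D^T)^T$
(i) $(CD)E$
(j) $C(BA)$
(k) $\operatorname{tr}(DE^T)$
(l) $\operatorname{tr}(BC)$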

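5. Using the matrices in Exercise 3, in each part compute the given expression (where possible).
(a) $AB$
(b) $BA$
(c) $(3E)D$
(d) $(AB)C$
(e) $A(BC)$
(f) $CC^T$
(g) $(DA)^T$
(h) $(C^TB)A^T$
(i) $\operatorname{tr}(DD^T)$
(j) $\operatorname{tr}(4E^T - D)$
(k) $\operatorname{tr}(C^TA^T + 2E^T)$
(l) $\operatorname{tr}((EC^T)^TA)$

Answer:
(a) $\begin{bmatrix} 12 & -3 \\ -4 & 5 \\ 4 & 1 \end{bmatrix}$
(b) Undefined
(c) $\begin{bmatrix} 42 & 108 & 75 \\ 12 & -3 & 21 \\ 36 & 78 & 63 \end{bmatrix}$
(d) $\begin{bmatrix} 3 & 45 & 9 \\ 11 & -11 & 17 \\ 7 & 17 & 13 \end{bmatrix}$
(e) $\begin{bmatrix} 3 & 45 & 9 \\ 11 & -11 & 17 \\ 7 & 17 & 13 \end{bmatrix}$
(f) $\begin{bmatrix} 21 & 17 \\ 17 & 35 \end{bmatrix}$
(g) $\begin{bmatrix} 0 & -2 & 11 \\ 12 & 1 & 8 \end{bmatrix}$
(h) $\begin{bmatrix} 12 & 6 & 9 \\ 48 & -20 & 14 \\ 24 & 8 & 16 \end{bmatrix}$
(i) 61
(j) 35
(k) 28
(l) 99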


6. Using the matrices in Exercise 3, in each part compute the given expression (where possible).
(a) $(2D^T - E)A$
(b) $(4B)C + 2B$
(c) $(-AC)^T + 5D^T$
(d) $(BA^T - 2C)^T$
(e) $B^T(CC^T - A^TA)$
(f) $D^TE^T - (ED)^T$

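7. Let

$$A = \begin{bmatrix} 3 & -2 & 7 \\ 6 & 5 & 4 \\ 0 & 4 & 9 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 6 & -2 & 4 \\ 0 & 1 & 3 \\ 7 & 7 & 5 \end{bmatrix}$$

Use the row method or column method (as appropriate) to find
(a) the first row of AB.
(b) the third row of AB.
(c) the second column of AB.
(d) the first column of BA.
(e) the third row of AA.
(f) the third column of AA.


Answer:
(a) $\begin{bmatrix} 67 & 41 & 41 \end{bmatrix}$
(b) $\begin{bmatrix} 63 & 67 & 57 \end{bmatrix}$
(c) $\begin{bmatrix} 41 \\ 21 \\ 67 \end{bmatrix}$
(d) $\begin{bmatrix} 6 \\ 6 \\ 63 \end{bmatrix}$
(e) $\begin{bmatrix} 24 & 56 & 97 \end{bmatrix}$
(f) $\begin{bmatrix} 76 \\ 98 \\ 97 \end{bmatrix}$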

8. Referring to the matrices in Exercise 7, use the row method or column method (as appropriate) to find
(a) the first column of AB. 
(b) the third column of BB. 
(c) the second row of BB. 
(d) the first column of AA. 
(e) the third column of AB. 
(f) the first row of BA. 


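9. Referring to the matrices A and B in Exercise 7, and Example 9,
(a) express each column vector of AA as a linear combination of the column vectors of A.
(b) express each column vector of BB as a linear combination of the column vectors of B.


Answer:

(a) $\begin{bmatrix} -3 \\ 48 \\ 24 \end{bmatrix} = 3 \begin{bmatrix} 3 \\ 6 \\ 0 \end{bmatrix} + 6 \begin{bmatrix} -2 \\ 5 \\ 4 \end{bmatrix} + 0 \begin{bmatrix} 7 \\ 4 \\ 9 \end{bmatrix}$; $\quad \begin{bmatrix} 12 \\ 29 \\ 56 \end{bmatrix} = -2 \begin{bmatrix} 3 \\ 6 \\ 0 \end{bmatrix} + 5 \begin{bmatrix} -2 \\ 5 \\ 4 \end{bmatrix} + 4 \begin{bmatrix} 7 \\ 4 \\ 9 \end{bmatrix}$; $\quad \begin{bmatrix} 76 \\ 98 \\ 97 \end{bmatrix} = 7 \begin{bmatrix} 3 \\ 6 \\ 0 \end{bmatrix} + 4 \begin{bmatrix} -2 \\ 5 \\ 4 \end{bmatrix} + 9 \begin{bmatrix} 7 \\ 4 \\ 9 \end{bmatrix}$

(b) $\begin{bmatrix} 64 \\ 21 \\ 77 \end{bmatrix} = 6 \begin{bmatrix} 6 \\ 0 \\ 7 \end{bmatrix} + 0 \begin{bmatrix} -2 \\ 1 \\ 7 \end{bmatrix} + 7 \begin{bmatrix} 4 \\ 3 \\ 5 \end{bmatrix}$; $\quad \begin{bmatrix} 14 \\ 22 \\ 28 \end{bmatrix} = -2 \begin{bmatrix} 6 \\ 0 \\ 7 \end{bmatrix} + 1 \begin{bmatrix} -2 \\ 1 \\ 7 \end{bmatrix} + 7 \begin{bmatrix} 4 \\ 3 \\ 5 \end{bmatrix}$; $\quad \begin{bmatrix} 38 \\ 18 \\ 74 \end{bmatrix} = 4 \begin{bmatrix} 6 \\ 0 \\ 7 \end{bmatrix} + 3 \begin{bmatrix} -2 \\ 1 \\ 7 \end{bmatrix} + 5 \begin{bmatrix} 4 \\ 3 \\ 5 \end{bmatrix}$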
10. Referring to the matrices A and B in Exercise 7, and Example 9, 
(a) express each column vector of AB as a linear combination of the column vectors of A. 
(b) express each column vector of BA as a linear combination of the column vectors of B. 
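11. In each part, find matrices A, $\mathbf{x}$, and $\mathbf{b}$ that express the given system of linear equations as a single matrix equation
$A\mathbf{x} = \mathbf{b}$, and write out this matrix equation.

(a) $\begin{aligned} 2x_1 - 3x_2 + 5x_3 &= 7 \\ 9x_1 - x_2 + x_3 &= -1 \\ x_1 + 5x_2 + 4x_3 &= 0 \end{aligned}$

(b) $\begin{aligned} 4x_1 - 3x_3 + x_4 &= 1 \\ 5x_1 + x_2 - 8x_4 &= 3 \\ 2x_1 - 5x_2 + 9x_3 - x_4 &= 0 \\ 3x_2 - x_3 + 7x_4 &= 2 \end{aligned}$

Answer:

(a) $\begin{bmatrix} 2 & -3 & 5 \\ 9 & -1 & 1 \\ 1 & 5 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 7 \\ -1 \\ 0 \end{bmatrix}$

(b) $\begin{bmatrix} 4 & 0 & -3 & 1 \\ 5 & 1 & 0 & -8 \\ 2 & -5 & 9 & -1 \\ 0 & 3 & -1 & 7 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 0 \\ 2 \end{bmatrix}$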

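12. In each part, find matrices A, $\mathbf{x}$, and $\mathbf{b}$ that express the given system of linear equations as a single matrix equation
$A\mathbf{x} = \mathbf{b}$, and write out this matrix equation.

(a) $\begin{aligned} x_1 - 2x_2 + 3x_3 &= -3 \\ 2x_1 + x_2 &= 0 \\ -3x_2 + 4x_3 &= 1 \\ x_1 + x_3 &= 5 \end{aligned}$

(b) $\begin{aligned} 3x_1 + 3x_2 + 3x_3 &= -3 \\ -x_1 - 5x_2 - 2x_3 &= 3 \\ -4x_2 + x_3 &= 0 \end{aligned}$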


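13. In each part, express the matrix equation as a system of linear equations.

(a) $\begin{bmatrix} 5 & 6 & -7 \\ -1 & -2 & 3 \\ 0 & 4 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 0 \\ 5 & -3 & -6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ -9 \end{bmatrix}$

Answer:

(a) $\begin{aligned} 5x_1 + 6x_2 - 7x_3 &= 2 \\ -x_1 - 2x_2 + 3x_3 &= 0 \\ 4x_2 - x_3 &= 3 \end{aligned}$

(b) $\begin{aligned} x_1 + x_2 + x_3 &= 2 \\ 2x_1 + 3x_2 &= 2 \\ 5x_1 - 3x_2 - 6x_3 &= -9 \end{aligned}$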

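14. In each part, express the matrix equation as a system of linear equations.

(a) $\begin{bmatrix} 3 & -1 & 2 \\ 4 & 3 & 7 \\ -2 & 1 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 4 \end{bmatrix}$

(b) $\begin{bmatrix} 3 & -2 & 0 & 1 \\ 5 & 0 & 2 & -2 \\ 3 & 1 & 4 & 7 \\ -2 & 5 & 1 & 6 \end{bmatrix} \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$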

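In Exercises 15–16, find all values of k, if any, that satisfy the equation.


15. $\begin{bmatrix} k & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 2 \\ 0 & 2 & -3 \end{bmatrix} \begin{bmatrix} k \\ 1 \\ 1 \end{bmatrix} = 0$

Answer:
−1

16. $\begin{bmatrix} 2 & 2 & k \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 2 & 0 & 3 \\ 0 & 3 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \\ k \end{bmatrix} = 0$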

In Exercises 17-18, solve the matrix equation for a, b, c, and d. 


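17. $\begin{bmatrix} a & 3 \\ -1 & a + b \end{bmatrix} = \begin{bmatrix} 4 & d - 2c \\ d + 2c & -2 \end{bmatrix}$

Answer:
$a = 4$, $b = -6$, $c = -1$, $d = 1$

18. $\begin{bmatrix} a - b & b + a \\ 3d + c & 2d - c \end{bmatrix} = \begin{bmatrix} 8 & 1 \\ 7 & 6 \end{bmatrix}$
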
19. Let A be any $m \times n$ matrix and let 0 be the $m \times n$ matrix each of whose entries is zero. Show that if $kA = 0$, then
$k = 0$ or $A = 0$.


20. (a) Show that if AB and BA are both defined, then AB and BA are square matrices.
(b) Show that if A is an $m \times n$ matrix and A(BA) is defined, then B is an $n \times m$ matrix.


21. Prove: If A and B are $n \times n$ matrices, then $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$.
22. (a) Show that if А has a row of zeros and B is any matrix for which AB is defined, then AB also has a row of 
Zeros. 
(b) Find a similar result involving a column of zeros. 


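23. In each part, find a $6 \times 6$ matrix $[a_{ij}]$ that satisfies the stated condition. Make your answers as general as possible
by using letters rather than specific numbers for the nonzero entries.

(a) $a_{ij} = 0$ if $i \neq j$
(b) $a_{ij} = 0$ if $i > j$
(c) $a_{ij} = 0$ if $i < j$
(d) $a_{ij} = 0$ if $|i - j| > 1$


Answer:

(a) $\begin{bmatrix} a_{11} & 0 & 0 & 0 & 0 & 0 \\ 0 & a_{22} & 0 & 0 & 0 & 0 \\ 0 & 0 & a_{33} & 0 & 0 & 0 \\ 0 & 0 & 0 & a_{44} & 0 & 0 \\ 0 & 0 & 0 & 0 & a_{55} & 0 \\ 0 & 0 & 0 & 0 & 0 & a_{66} \end{bmatrix}$

(b) $\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16} \\ 0 & a_{22} & a_{23} & a_{24} & a_{25} & a_{26} \\ 0 & 0 & a_{33} & a_{34} & a_{35} & a_{36} \\ 0 & 0 & 0 & a_{44} & a_{45} & a_{46} \\ 0 & 0 & 0 & 0 & a_{55} & a_{56} \\ 0 & 0 & 0 & 0 & 0 & a_{66} \end{bmatrix}$

(c) $\begin{bmatrix} a_{11} & 0 & 0 & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 & 0 & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} & 0 & 0 \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} & 0 \\ a_{61} & a_{62} & a_{63} & a_{64} & a_{65} & a_{66} \end{bmatrix}$

(d) $\begin{bmatrix} a_{11} & a_{12} & 0 & 0 & 0 & 0 \\ a_{21} & a_{22} & a_{23} & 0 & 0 & 0 \\ 0 & a_{32} & a_{33} & a_{34} & 0 & 0 \\ 0 & 0 & a_{43} & a_{44} & a_{45} & 0 \\ 0 & 0 & 0 & a_{54} & a_{55} & a_{56} \\ 0 & 0 & 0 & 0 & a_{65} & a_{66} \end{bmatrix}$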

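24. Find the $4 \times 4$ matrix $A = [a_{ij}]$ whose entries satisfy the stated condition.
(a) $a_{ij} = i + j$
(b) $a_{ij} = i^{\,j-1}$
(c) $a_{ij} = \begin{cases} 1 & \text{if } |i - j| > 1 \\ -1 & \text{if } |i - j| \le 1 \end{cases}$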

25. Consider the function $\mathbf{y} = f(\mathbf{x})$ defined for $2 \times 1$ matrices $\mathbf{x}$ by $\mathbf{y} = A\mathbf{x}$, where


alo 


Plot f(x) together with x in each case below. How would you describe the action of f? 


Answer: 
X) [Xp x2 
а) 


Alt 


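26. Let I be the $n \times n$ matrix whose entry in row i and column j is

$$\begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$

Show that $AI = IA = A$ for every $n \times n$ matrix A.


27. How many $3 \times 3$ matrices A can you find such that

$$A \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ x - y \\ 0 \end{bmatrix}$$

for all choices of x, y, and z?


Answer:
One; namely $A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$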


28. How many 3 x 3 matrices A can you find such that 


for all choices of x, y, and z? 


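29. A matrix B is said to be a square root of a matrix A if $BB = A$.


(a) Find two square roots of $A = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}$.


(b) How many different square roots can you find of $A = \begin{bmatrix} 5 & 0 \\ 0 & 9 \end{bmatrix}$?


(c) Do you think that every $2 \times 2$ matrix has at least one square root? Explain your reasoning.


Answer:

(a) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $\begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix}$

(b) Four: $\begin{bmatrix} \sqrt{5} & 0 \\ 0 & 3 \end{bmatrix}$, $\begin{bmatrix} -\sqrt{5} & 0 \\ 0 & 3 \end{bmatrix}$, $\begin{bmatrix} \sqrt{5} & 0 \\ 0 & -3 \end{bmatrix}$, $\begin{bmatrix} -\sqrt{5} & 0 \\ 0 & -3 \end{bmatrix}$
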
30. Let 0 denote a 2 x 2 matrix, each of whose entries is zero. 


(a) Is there a $2 \times 2$ matrix A such that $A \neq 0$ and $AA = 0$? Justify your answer.
(b) Is there a $2 \times 2$ matrix A such that $A \neq 0$ and $AA = A$? Justify your answer.


True-False Exercises 


In parts (a)–(o) determine whether the statement is true or false, and justify your answer.
(a) The matrix $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ has no main diagonal.


Answer: 


True 


(b) An $m \times n$ matrix has m column vectors and n row vectors.
Answer: 


False 


(c) If A and B are $2 \times 2$ matrices, then $AB = BA$.
Answer: 


False 


(d) The i th row vector of a matrix product АВ can be computed by multiplying А by the ith row vector of B. 
Answer: 


False 


(e) For every matrix A, it is true that $(A^T)^T = A$.


Answer: 


True 
(f) If A and B are square matrices of the same order, then $\operatorname{tr}(AB) = \operatorname{tr}(A)\operatorname{tr}(B)$.


Answer: 


False 


(g) If A and B are square matrices of the same order, then $(AB)^T = A^TB^T$.


Answer: 
False 
(h) For every square matrix A, it is true that $\operatorname{tr}(A^T) = \operatorname{tr}(A)$.
Answer:
True


(i) If A is a $6 \times 4$ matrix and B is an $m \times n$ matrix such that $B^TA^T$ is a $2 \times 6$ matrix, then $m = 4$ and $n = 2$.


Answer: 


True 


(j) If A is an $n \times n$ matrix and c is a scalar, then $\operatorname{tr}(cA) = c\operatorname{tr}(A)$.
Answer: 


True 


(k) If A, B, and C are matrices of the same size such that $A - C = B - C$, then $A = B$.
Answer: 


True 


(l) If A, B, and C are square matrices of the same order such that $AC = BC$, then $A = B$.
Answer: 


False 


(m) If $AB + BA$ is defined, then A and B are square matrices of the same size.
Answer: 


True 


(n) If B has a column of zeros, then so does AB if this product is defined. 
Answer: 


True 


(о) If B has a column of zeros, then so does BA if this product is defined. 
Answer: 


False 

1.4 Inverses; Algebraic Properties of Matrices


In this section we will discuss some of the algebraic properties of matrix operations. We will see that many of 
the basic rules of arithmetic for real numbers hold for matrices, but we will also see that some do not. 


Properties of Matrix Addition and Scalar Multiplication 


The following theorem lists the basic algebraic properties of the matrix operations. 


THEOREM 1.4.1 Properties of Matrix Arithmetic 


Assuming that the sizes of the matrices are such that the indicated operations can be performed, the 
following rules of matrix arithmetic are valid. 

(a) $A + B = B + A$ (Commutative law for addition)

(b) $A + (B + C) = (A + B) + C$ (Associative law for addition)

(c) $A(BC) = (AB)C$ (Associative law for multiplication)

(d) $A(B + C) = AB + AC$ (Left distributive law)

(e) $(B + C)A = BA + CA$ (Right distributive law)

(f) $A(B - C) = AB - AC$

(g) $(B - C)A = BA - CA$

(h) $a(B + C) = aB + aC$

(i) $a(B - C) = aB - aC$

(j) $(a + b)C = aC + bC$

(k) $(a - b)C = aC - bC$

(l) $a(bC) = (ab)C$

(m) $a(BC) = (aB)C = B(aC)$


To prove any of the equalities in this theorem we must show that the matrix on the left side has the same size 
as that on the right and that the corresponding entries on the two sides are the same. Most of the proofs follow 
the same pattern, so we will prove part (d) as a sample. The proof of the associative law for multiplication is 
more complicated than the rest and is outlined in the exercises. 


There are three basic ways to prove that two 
matrices of the same size are equal—prove that 
corresponding entries are the same, prove that 
corresponding row vectors are the same, or 
prove that corresponding column vectors are 
the same. 


Proof (d) We must show that $A(B + C)$ and $AB + AC$ have the same size and that corresponding entries
are equal. To form $A(B + C)$, the matrices B and C must have the same size, say $m \times n$, and the matrix A
must then have m columns, so its size must be of the form $r \times m$. This makes $A(B + C)$ an $r \times n$ matrix. It
follows that $AB + AC$ is also an $r \times n$ matrix and, consequently, $A(B + C)$ and $AB + AC$ have the same size.


Suppose that $A = [a_{ij}]$, $B = [b_{ij}]$, and $C = [c_{ij}]$. We want to show that corresponding entries of
$A(B + C)$ and $AB + AC$ are equal; that is,

$$[A(B + C)]_{ij} = [AB + AC]_{ij}$$

for all values of i and j. But from the definitions of matrix addition and matrix multiplication, we have

$$\begin{aligned} [A(B + C)]_{ij} &= a_{i1}(b_{1j} + c_{1j}) + a_{i2}(b_{2j} + c_{2j}) + \cdots + a_{im}(b_{mj} + c_{mj}) \\ &= (a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj}) + (a_{i1}c_{1j} + a_{i2}c_{2j} + \cdots + a_{im}c_{mj}) \\ &= [AB]_{ij} + [AC]_{ij} = [AB + AC]_{ij} \end{aligned}$$


Remark Although the operations of matrix addition and matrix multiplication were defined for pairs of 
matrices, associative laws (b) and (c) enable us to denote sums and products of three matrices as А + B + C 
and ABC without inserting any parentheses. This is justified by the fact that no matter how parentheses are 
inserted, the associative laws guarantee that the same end result will be obtained. In general, given any sum or 
any product of matrices, pairs of parentheses can be inserted or deleted anywhere within the expression 
without affecting the end result. 


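EXAMPLE 1 Associativity of Matrix Multiplication

As an illustration of the associative law for matrix multiplication, consider
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}
\]
Then
\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}
= \begin{bmatrix} 8 & 5 \\ 20 & 13 \\ 2 & 1 \end{bmatrix}
\quad \text{and} \quad
BC = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}
= \begin{bmatrix} 10 & 9 \\ 4 & 3 \end{bmatrix}
\]
Thus
\[
(AB)C = \begin{bmatrix} 8 & 5 \\ 20 & 13 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}
= \begin{bmatrix} 18 & 15 \\ 46 & 39 \\ 4 & 3 \end{bmatrix}
\]
and
\[
A(BC) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 10 & 9 \\ 4 & 3 \end{bmatrix}
= \begin{bmatrix} 18 & 15 \\ 46 & 39 \\ 4 & 3 \end{bmatrix}
\]
so (AB)C = A(BC), as guaranteed by Theorem 1.4.1(c).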
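
The check in Example 1 can also be carried out numerically. The following is a minimal sketch, assuming the matrices of Example 1 are A = [1 2; 3 4; 0 1], B = [4 3; 2 1], and C = [1 0; 2 3]; the helper name `matmul` is our own and not part of the text.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (row-by-column rule)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2], [3, 4], [0, 1]]
B = [[4, 3], [2, 1]]
C = [[1, 0], [2, 3]]

left = matmul(matmul(A, B), C)    # (AB)C
right = matmul(A, matmul(B, C))   # A(BC)
print(left == right)
```

Both groupings give the same 3 × 2 matrix, although the intermediate products AB and BC have different sizes.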


Properties of Matrix Multiplication 


Do not let Theorem 1.4.1 lull you into believing that all laws of real arithmetic carry over to matrix
arithmetic. For example, you know that in real arithmetic it is always true that ab = ba, which is called the
commutative law for multiplication. In matrix arithmetic, however, the equality of AB and BA can fail for 
three possible reasons: 


1. AB may be defined and BA may not (for example, if A is 2 x 3 and B is 3 x 4). 

2. AB and BA may both be defined, but they may have different sizes (for example, if A is 2 x 3 and B is 
3 x 2). 

3. AB and BA may both be defined and have the same size, but the two matrices may be different (as 
illustrated in the next example). 
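
The three cases listed above can be sketched in a few lines. The matrices below are illustrative choices of our own (an assumption, not taken from the text), and `matmul` and `shape` are hypothetical helper names.

```python
def matmul(X, Y):
    """Multiply matrices stored as lists of rows; raise if the sizes do not match."""
    if len(X[0]) != len(Y):
        raise ValueError("product undefined: columns of X must equal rows of Y")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def shape(X):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(X), len(X[0]))

# Case 1: A is 2 x 3 and B is 3 x 4, so AB is defined but BA is not.
A1 = [[1, 0, 2], [0, 1, 1]]
B1 = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
try:
    matmul(B1, A1)
    ba_defined = True
except ValueError:
    ba_defined = False

# Case 2: A is 2 x 3 and B is 3 x 2, so AB is 2 x 2 but BA is 3 x 3.
B2 = [[1, 0], [0, 1], [1, 1]]
sizes = (shape(matmul(A1, B2)), shape(matmul(B2, A1)))

# Case 3: both products are 2 x 2, but their entries differ.
A3 = [[1, 2], [0, 1]]
B3 = [[1, 0], [3, 1]]
AB, BA = matmul(A3, B3), matmul(B3, A3)
print(ba_defined, sizes, AB != BA)
```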


Do not read too much into Example 2; it does
not rule out the possibility that AB and BA may
be equal in certain cases, just that they are not
equal in all cases. If it so happens that
AB = BA, then we say that A and B
commute.


EXAMPLE 2 Order Matters in Matrix Multiplication


Consider the matrices 
Multiplying gives 


Thus, AB ≠ BA.


Zero Matrices 


A matrix whose entries are all zero is called a zero matrix. Some examples are 
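\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
[0]
\]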


We will denote a zero matrix by 0 unless it is important to specify its size, in which case we will denote the
m × n zero matrix by $0_{m \times n}$.


It should be evident that if A and 0 are matrices with the same size, then

A + 0 = 0 + A = A
Thus, 0 plays the same role in this matrix equation that the number 0 plays in the numerical equation
a + 0 = 0 + a = a.


The following theorem lists the basic properties of zero matrices. Since the results should be self-evident, we 
will omit the formal proofs. 


THEOREM 1.4.2 Properties of Zero Matrices 


If c is a scalar, and if the sizes of the matrices are such that the operations can be performed, then:

(a) A + 0 = 0 + A = A
(b) A − 0 = A
(c) A − A = A + (−A) = 0
(d) 0A = 0
(e) If cA = 0, then c = 0 or A = 0.


Since we know that the commutative law of real arithmetic is not valid in matrix arithmetic, it should not be 
surprising that there are other rules that fail as well. For example, consider the following two laws of real 
arithmetic: 


* If ab = ac and a ≠ 0, then b = c. [The cancellation law]
* If ab = 0, then at least one of the factors on the left is 0.


The next two examples show that these laws are not universally true in matrix arithmetic. 


EXAMPLE 3 Failure of the Cancellation Law


a= ah ве ар [4 


We leave it for you to confirm that 
AB = AC = [ 8) 


6 8 


Although A ≠ 0, canceling A from both sides of the equation AB = AC would lead to the
incorrect conclusion that B = C. Thus, the cancellation law does not hold, in general, for matrix
multiplication.
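
A failure of the cancellation law is easy to reproduce by machine. The sketch below uses matrices chosen for illustration (an assumption; they need not be the ones printed in Example 3), and `matmul` is a hypothetical helper.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Illustrative matrices: A is nonzero and B differs from C, yet AB equals AC.
A = [[0, 1], [0, 2]]
B = [[1, 1], [3, 4]]
C = [[2, 5], [3, 4]]

AB = matmul(A, B)
AC = matmul(A, C)
print(AB == AC, B == C)
```

Here B and C differ only in their first rows, and the zero first column of A wipes out exactly that difference, which is why "dividing by A" is not legitimate.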


Consider the matrices 


EXAMPLE 4 A Zero Product with Nonzero Factors


Here are two matrices for which AB = 0, but A ≠ 0 and B ≠ 0:


“е 


Identity Matrices 


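A square matrix with 1's on the main diagonal and zeros elsewhere is called an identity matrix. Some
examples are
\[
[1], \quad
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]

An identity matrix is denoted by the letter I. If it is important to emphasize the size, we will write $I_n$ for the
n × n identity matrix.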


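To explain the role of identity matrices in matrix arithmetic, let us consider the effect of multiplying a general
2 × 3 matrix A on each side by an identity matrix. Multiplying on the right by the 3 × 3 identity matrix yields
\[
AI_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = A
\]
and multiplying on the left by the 2 × 2 identity matrix yields
\[
I_2A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}
= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = A
\]
The same result holds in general; that is, if A is any m × n matrix, then
\[ AI_n = A \quad \text{and} \quad I_mA = A \]

Thus, the identity matrices play the same role in these matrix equations that the number 1 plays in the
numerical equation a · 1 = 1 · a = a.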
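
As a quick check of these two identities, here is a minimal sketch with an arbitrary 2 × 3 matrix; the sample entries and the helper names `matmul` and `identity` are our own assumptions.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3], [4, 5, 6]]           # an arbitrary 2 x 3 matrix
right_product = matmul(A, identity(3))   # A times the 3 x 3 identity
left_product = matmul(identity(2), A)    # the 2 x 2 identity times A
print(right_product == A, left_product == A)
```

Note that the identity on the right must be 3 × 3 and the one on the left 2 × 2, since otherwise the products are not defined.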


As the next theorem shows, identity matrices arise naturally in studying reduced row echelon forms of square 
matrices. 


THEOREM 1.4.3 


If R is the reduced row echelon form of an n × n matrix A, then either R has a row of zeros or R is the
identity matrix $I_n$.


Proof Suppose that the reduced row echelon form of A is
\[
R = \begin{bmatrix}
r_{11} & r_{12} & \cdots & r_{1n} \\
r_{21} & r_{22} & \cdots & r_{2n} \\
\vdots & \vdots & & \vdots \\
r_{n1} & r_{n2} & \cdots & r_{nn}
\end{bmatrix}
\]

Either the last row in this matrix consists entirely of zeros or it does not. If not, the matrix contains no zero
rows, and consequently each of the n rows has a leading entry of 1. Since these leading 1's occur
progressively farther to the right as we move down the matrix, each of these 1's must occur on the main
diagonal. Since the other entries in the same column as one of these 1's are zero, R must be $I_n$. Thus, either R
has a row of zeros or $R = I_n$.


Inverse of a Matrix 


In real arithmetic every nonzero number a has a reciprocal $a^{-1}$ (= 1/a) with the property
\[ a \cdot a^{-1} = a^{-1} \cdot a = 1 \]

The number $a^{-1}$ is sometimes called the multiplicative inverse of a. Our next objective is to develop an
analog of this result for matrix arithmetic. For this purpose we make the following definition.


DEFINITION 1 


If A is a square matrix, and if a matrix B of the same size can be found such that AB = BA = I, then A
is said to be invertible (or nonsingular) and B is called an inverse of A. If no such matrix B can be
found, then A is said to be singular.


Remark The relationship AB = BA = I is not changed by interchanging A and B, so if A is invertible and B
is an inverse of A, then it is also true that B is invertible, and A is an inverse of B. Thus, when

AB = BA = I

we say that A and B are inverses of one another.


EXAMPLE 5 An Invertible Matrix


Let 


Then 


е = 12 T 
aap 


Thus, А and B are invertible and each is an inverse of the other. 


EXAMPLE 6 Class of Singular Matrices


In general, a square matrix with a row or column of zeros is singular. To help understand why
this is so, consider the matrix
\[
A = \begin{bmatrix} 1 & 4 & 0 \\ 2 & 5 & 0 \\ 3 & 6 & 0 \end{bmatrix}
\]


To prove that A is singular we must show that there is no 3 × 3 matrix B such that AB = BA = I.
For this purpose let $\mathbf{c}_1$, $\mathbf{c}_2$, $\mathbf{0}$ be the column vectors of A. Thus, for any 3 × 3 matrix B we
can express the product BA as
\[
BA = B[\mathbf{c}_1 \;\; \mathbf{c}_2 \;\; \mathbf{0}] = [B\mathbf{c}_1 \;\; B\mathbf{c}_2 \;\; \mathbf{0}] \qquad \text{[Formula (6) of Section 1.3]}
\]

The column of zeros shows that BA ≠ I and hence that A is singular.


Properties of Inverses 


It is reasonable to ask whether an invertible matrix can have more than one inverse. The next theorem shows
that the answer is no: an invertible matrix has exactly one inverse.


THEOREM 1.4.4 


If B and C are both inverses of the matrix A, then B = C. 


Proof Since B is an inverse of A, we have BA = I. Multiplying both sides on the right by C gives
(BA)C = IC = C. But it is also true that (BA)C = B(AC) = BI = B, so C = B.


As a consequence of this important result, we can now speak of "the" inverse of an invertible matrix. If A is
invertible, then its inverse will be denoted by the symbol $A^{-1}$. Thus,
\[ AA^{-1} = I \quad \text{and} \quad A^{-1}A = I \tag{1} \]

The inverse of A plays much the same role in matrix arithmetic that the reciprocal $a^{-1}$ plays in the numerical
relationships $aa^{-1} = 1$ and $a^{-1}a = 1$.


In the next section we will develop a method for computing the inverse of an invertible matrix of any size. 
For now we give the following theorem that specifies conditions under which a 2 x 2 matrix is invertible and 
provides a simple formula for its inverse. 


THEOREM 1.4.5 


The matrix
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
is invertible if and only if ad − bc ≠ 0, in which case the inverse is given by the formula
\[ A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \tag{2} \]

We will omit the proof, because we will study a more general version of this theorem later. For now, you
should at least confirm the validity of Formula 2 by showing that $AA^{-1} = A^{-1}A = I$.


Historical Note The formula for $A^{-1}$ given in Theorem 1.4.5 first appeared (in a more general
form) in Arthur Cayley's 1858 Memoir on the Theory of Matrices. The more general result that
Cayley discovered will be studied later.


The quantity ad − bc in Theorem 1.4.5 is
called the determinant of the 2 × 2 matrix A
and is denoted by

det(A) = ad − bc
or alternatively by
\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]


Remark Figure 1.4.1 illustrates that the determinant of a 2 x 2 matrix A is the product of the entries on its 
main diagonal minus the product of the entries off its main diagonal. In words, Theorem 1.4.5 states that a 

2 x 2 matrix A is invertible if and only if its determinant is nonzero, and if invertible, then its inverse can be 
obtained by interchanging its diagonal entries, reversing the signs of its off-diagonal entries, and multiplying 
the entries by the reciprocal of the determinant of A. 


\[ \det(A) = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]

Figure 1.4.1


EXAMPLE 7 Calculating the Inverse of a 2 × 2 Matrix


In each part, determine whether the matrix is invertible. If so, find its inverse.

(a) $A = \begin{bmatrix} 6 & 1 \\ 5 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix}$


Solution

(a) The determinant of A is det(A) = (6)(2) − (1)(5) = 7, which is nonzero. Thus, A is
invertible, and its inverse is
\[
A^{-1} = \frac{1}{7}\begin{bmatrix} 2 & -1 \\ -5 & 6 \end{bmatrix}
= \begin{bmatrix} \frac{2}{7} & -\frac{1}{7} \\ -\frac{5}{7} & \frac{6}{7} \end{bmatrix}
\]
We leave it for you to confirm that $AA^{-1} = A^{-1}A = I$.
(b) The matrix is not invertible since det(A) = (−1)(−6) − (2)(3) = 0.
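
The rule in Theorem 1.4.5 translates directly into a short routine. The following is a sketch under the assumption that the entries are integers (so exact fractions can be used); the function name `inverse_2x2` is our own. It is applied to the two matrices of Example 7, read here as [6 1; 5 2] and [−1 2; 3 −6].

```python
from fractions import Fraction

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 integer matrix, or None if ad - bc = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None  # not invertible
    # Swap the diagonal entries, negate the off-diagonal entries, divide by det.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

inv_a = inverse_2x2([[6, 1], [5, 2]])      # part (a)
inv_b = inverse_2x2([[-1, 2], [3, -6]])    # part (b)
print(inv_a, inv_b)
```

Using `Fraction` rather than floating-point division keeps entries such as 2/7 exact, which matches the hand computation.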


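EXAMPLE 8 Solution of a Linear System by Matrix Inversion


A problem that arises in many applications is to solve a pair of equations of the form
\[
\begin{aligned}
u &= ax + by \\
v &= cx + dy
\end{aligned}
\]
for x and y in terms of u and v. One approach is to treat this as a linear system of two equations in the
unknowns x and y and use Gauss–Jordan elimination to solve for x and y. However, because the
coefficients of the unknowns are literal rather than numerical, this procedure is a little clumsy. As an
alternative approach, let us replace the two equations by the single matrix equation
\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} \]
which we can rewrite as
\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \]
If we assume that the 2 × 2 matrix is invertible (i.e., ad − bc ≠ 0), then we can multiply through on
the left by the inverse and rewrite the equation as
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}\begin{bmatrix} u \\ v \end{bmatrix}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
\]
which simplifies to
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]
Using Theorem 1.4.5, we can rewrite this equation as
\[ \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]
from which we obtain
\[ x = \frac{du - bv}{ad - bc}, \qquad y = \frac{av - cu}{ad - bc} \]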
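
The method of Example 8 can be sketched as a small function. It assumes ad − bc ≠ 0 and integer inputs; the name `solve_2x2` and the sample system 3x − 2y = −1, 4x + 5y = 3 are illustrative choices of our own.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, u, v):
    """Solve u = a*x + b*y, v = c*x + d*y for (x, y) by inverting the coefficient matrix."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("coefficient matrix is not invertible")
    # [x, y] = (1/det) * [[d, -b], [-c, a]] * [u, v]
    x = Fraction(d * u - b * v, det)
    y = Fraction(a * v - c * u, det)
    return x, y

x, y = solve_2x2(3, -2, 4, 5, -1, 3)
print(x, y)
```

Substituting the result back into both equations is a cheap way to confirm it.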


The next theorem is concerned with inverses of matrix products. 


THEOREM 1.4.6 


If A and B are invertible matrices with the same size, then AB is invertible and
\[ (AB)^{-1} = B^{-1}A^{-1} \]


Proof We can establish the invertibility and obtain the stated formula at the same time by showing that
\[ (AB)(B^{-1}A^{-1}) = (B^{-1}A^{-1})(AB) = I \]
But
\[ (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \]
and similarly, $(B^{-1}A^{-1})(AB) = I$.


Although we will not prove it, this result can be extended to three or more factors:


A product of any number of invertible matrices is invertible, and the inverse of the product is the
product of the inverses in the reverse order.


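EXAMPLE 9 The Inverse of a Product


Consider the matrices
\[ A = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 2 \\ 2 & 2 \end{bmatrix} \]
We leave it for you to show that
\[
AB = \begin{bmatrix} 7 & 6 \\ 9 & 8 \end{bmatrix}, \quad
(AB)^{-1} = \begin{bmatrix} 4 & -3 \\ -\frac{9}{2} & \frac{7}{2} \end{bmatrix}
\]
and also that
\[
A^{-1} = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}, \quad
B^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & \frac{3}{2} \end{bmatrix}, \quad
B^{-1}A^{-1} = \begin{bmatrix} 1 & -1 \\ -1 & \frac{3}{2} \end{bmatrix}\begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 4 & -3 \\ -\frac{9}{2} & \frac{7}{2} \end{bmatrix}
\]

Thus, $(AB)^{-1} = B^{-1}A^{-1}$ as guaranteed by Theorem 1.4.6.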
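
The reversal of order in Theorem 1.4.6 can be checked numerically. This sketch assumes the matrices of Example 9 are A = [1 2; 1 3] and B = [3 2; 2 2]; `matmul` and `inverse_2x2` are hypothetical helpers written for this illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Invert a 2 x 2 integer matrix by the determinant formula (assumes ad - bc != 0)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 2], [1, 3]]
B = [[3, 2], [2, 2]]

inv_AB = inverse_2x2(matmul(A, B))                      # (AB)^{-1}
reversed_order = matmul(inverse_2x2(B), inverse_2x2(A)) # B^{-1} A^{-1}
same_order = matmul(inverse_2x2(A), inverse_2x2(B))     # A^{-1} B^{-1}
print(inv_AB == reversed_order, inv_AB == same_order)
```

For these matrices the product of the inverses in the original order, A⁻¹B⁻¹, does not equal (AB)⁻¹, so the reversal matters.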


Powers of a Matrix 


If A is a square matrix, then we define the nonnegative integer powers of A to be
\[ A^0 = I \quad \text{and} \quad A^n = AA \cdots A \quad [n \text{ factors}] \]
and if A is invertible, then we define the negative integer powers of A to be
\[ A^{-n} = (A^{-1})^n = A^{-1}A^{-1} \cdots A^{-1} \quad [n \text{ factors}] \]

Because these definitions parallel those for real numbers, the usual laws of nonnegative exponents hold; for
example,
\[ A^rA^s = A^{r+s} \quad \text{and} \quad (A^r)^s = A^{rs} \]


If a product of matrices is singular, then at least 
one of the factors must be singular. Why? 


In addition, we have the following properties of negative exponents.

THEOREM 1.4.7


If A is invertible and n is a nonnegative integer, then:

(a) $A^{-1}$ is invertible and $(A^{-1})^{-1} = A$.

(b) $A^n$ is invertible and $(A^n)^{-1} = A^{-n} = (A^{-1})^n$.

(c) kA is invertible for any nonzero scalar k, and $(kA)^{-1} = k^{-1}A^{-1}$.


We will prove part (c) and leave the proofs of parts (a) and (b) as exercises.


Proof (c) Properties (c) and (m) in Theorem 1.4.1 imply that
\[ (kA)(k^{-1}A^{-1}) = k^{-1}(kA)A^{-1} = (k^{-1}k)AA^{-1} = (1)I = I \]
and similarly, $(k^{-1}A^{-1})(kA) = I$. Thus, kA is invertible and $(kA)^{-1} = k^{-1}A^{-1}$.


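EXAMPLE 10 Properties of Exponents


Let A and $A^{-1}$ be the matrices in Example 9; that is,
\[ A = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix} \quad \text{and} \quad A^{-1} = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix} \]
Then
\[
A^{-3} = (A^{-1})^3 = \begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 3 & -2 \\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 41 & -30 \\ -15 & 11 \end{bmatrix}
\]
Also,
\[
A^3 = \begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 1 & 3 \end{bmatrix}
= \begin{bmatrix} 11 & 30 \\ 15 & 41 \end{bmatrix}
\]
so, as expected from Theorem 1.4.7(b),
\[
(A^3)^{-1} = \frac{1}{(11)(41) - (30)(15)}\begin{bmatrix} 41 & -30 \\ -15 & 11 \end{bmatrix}
= \begin{bmatrix} 41 & -30 \\ -15 & 11 \end{bmatrix} = (A^{-1})^3
\]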
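
Integer powers of a matrix can be computed by repeated multiplication, exactly as in the definition. The sketch below assumes A = [1 2; 1 3] with inverse [3 −2; −1 1], as in Example 10; the helper names are our own, and the inverse is passed in rather than computed.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_pow(M, n, M_inv=None):
    """Return M**n for an integer n; a negative n requires the inverse M_inv."""
    if n < 0:
        if M_inv is None:
            raise ValueError("negative powers require an inverse")
        M, n = M_inv, -n          # A^(-n) is defined as (A^(-1))^n
    result = identity(len(M))     # A^0 = I
    for _ in range(n):
        result = matmul(result, M)
    return result

A = [[1, 2], [1, 3]]
A_inv = [[3, -2], [-1, 1]]

A3 = mat_pow(A, 3)
A_neg3 = mat_pow(A, -3, A_inv)
print(A3, A_neg3, matmul(A3, A_neg3))
```

The last product is the identity, which is the content of Theorem 1.4.7(b) for n = 3.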


EXAMPLE 11 The Square of a Matrix Sum


In real arithmetic, where we have a commutative law for multiplication, we can write
\[ (a + b)^2 = a^2 + ab + ba + b^2 = a^2 + ab + ab + b^2 = a^2 + 2ab + b^2 \]

However, in matrix arithmetic, where we have no commutative law for multiplication, the best
we can do is to write
\[ (A + B)^2 = A^2 + AB + BA + B^2 \]

It is only in the special case where A and B commute (i.e., AB = BA) that we can go a step
further and write
\[ (A + B)^2 = A^2 + 2AB + B^2 \]
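
To see the difference between the two expansions concretely, here is a sketch with a pair of noncommuting matrices chosen for illustration (an assumption, not from the text); `matmul` and `matadd` are hypothetical helpers.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(*Ms):
    """Add any number of matrices of the same size entry by entry."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*Ms)]

A = [[1, 2], [0, 1]]
B = [[1, 0], [3, 1]]
S = matadd(A, B)

square = matmul(S, S)                                   # (A + B)^2
full = matadd(matmul(A, A), matmul(A, B), matmul(B, A), matmul(B, B))
AB = matmul(A, B)
shortcut = matadd(matmul(A, A), AB, AB, matmul(B, B))   # A^2 + 2AB + B^2
print(square == full, square == shortcut)
```

The four-term expansion always agrees with (A + B)²; the familiar three-term shortcut fails here because AB ≠ BA.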


Matrix Polynomials 


If A is a square matrix, say n × n, and if
\[ p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_mx^m \]
is any polynomial, then we define the n × n matrix p(A) to be
\[ p(A) = a_0I + a_1A + a_2A^2 + \cdots + a_mA^m \tag{3} \]

where I is the n × n identity matrix; that is, p(A) is obtained by substituting A for x and replacing the constant
term $a_0$ by the matrix $a_0I$. An expression of form 3 is called a matrix polynomial in A.


EXAMPLE 12 A Matrix Polynomial
Find p(A) for 


p(x)-x?-2x-3 and а= |7) J 


Solution
\[ p(A) = A^2 - 2A - 3I \]

or more briefly, p(A) = 0.
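
A matrix polynomial can be evaluated with Horner's rule, multiplying by A and adding a multiple of I at each step. In the sketch below the matrix A = [−1 2; 0 3] is chosen for illustration (an assumption), and `mat_poly` is a hypothetical helper; the coefficient list is ordered from the constant term upward.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def mat_poly(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_m*A^m, where coeffs = [a_0, a_1, ..., a_m]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += c        # add c*I (the constant term becomes a_0*I)
    return result

A = [[-1, 2], [0, 3]]
pA = mat_poly([-3, -2, 1], A)        # p(x) = x^2 - 2x - 3
print(pA)
```

For this particular A the result is the 2 × 2 zero matrix, which shows that a nonzero polynomial can vanish at a matrix.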


Remark It follows from the fact that $A^rA^s = A^{r+s} = A^{s+r} = A^sA^r$ that powers of a square matrix
commute, and since a matrix polynomial in A is built up from powers of A, any two matrix polynomials in A
also commute; that is, for any polynomials $p_1$ and $p_2$ we have
\[ p_1(A)p_2(A) = p_2(A)p_1(A) \tag{4} \]


Properties of the Transpose 


The following theorem lists the main properties of the transpose. 


THEOREM 1.4.8 


If the sizes of the matrices are such that the stated operations can be performed, then:

(a) $(A^T)^T = A$

(b) $(A + B)^T = A^T + B^T$

(c) $(A - B)^T = A^T - B^T$

(d) $(kA)^T = kA^T$

(e) $(AB)^T = B^TA^T$


If you keep in mind that transposing a matrix interchanges its rows and columns, then you should have little
trouble visualizing the results in parts (a)–(d). For example, part (a) states the obvious fact that interchanging
rows and columns twice leaves a matrix unchanged; and part (b) states that adding two matrices and then
interchanging the rows and columns produces the same result as interchanging the rows and columns before
adding. We will omit the formal proofs. Part (e) is less obvious, but for brevity we will omit its proof as
well. The result in that part can be extended to three or more factors and restated as:


The transpose of a product of any number of matrices is the product of the transposes in the reverse
order.
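
Part (e) is the least intuitive of these rules, so a numerical check may help. The matrices below are illustrative (an assumption), and `matmul` and `transpose` are hypothetical helpers.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Interchange the rows and columns of a matrix."""
    return [list(col) for col in zip(*X)]

A = [[1, 2, 3], [4, 5, 6]]       # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]     # 3 x 2

lhs = transpose(matmul(A, B))                 # (AB)^T, a 2 x 2 matrix
rhs = matmul(transpose(B), transpose(A))      # B^T A^T, also 2 x 2
wrong = matmul(transpose(A), transpose(B))    # A^T B^T, a 3 x 3 matrix
print(lhs == rhs, len(wrong))
```

With non-square factors the unreversed product A^T B^T does not even have the right size, which makes the need for the reversal easy to remember.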


The following theorem establishes a relationship between the inverse of a matrix and the inverse of its 
transpose. 


THEOREM 1.4.9 


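If A is an invertible matrix, then $A^T$ is also invertible and
\[ (A^T)^{-1} = (A^{-1})^T \]


Proof We can establish the invertibility and obtain the formula at the same time by showing that
\[ A^T(A^{-1})^T = (A^{-1})^TA^T = I \]

But from part (e) of Theorem 1.4.8 and the fact that $I^T = I$, we have
\[
\begin{aligned}
A^T(A^{-1})^T &= (A^{-1}A)^T = I^T = I \\
(A^{-1})^TA^T &= (AA^{-1})^T = I^T = I
\end{aligned}
\]
which completes the proof.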


EXAMPLE 13 Inverse of a Transpose


Consider a general 2 × 2 invertible matrix and its transpose:
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{and} \quad A^T = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]
Since A is invertible, its determinant ad − bc is nonzero. But the determinant of $A^T$ is also
ad − bc (verify), so $A^T$ is also invertible. It follows from Theorem 1.4.5 that
\[
(A^T)^{-1} = \begin{bmatrix} \dfrac{d}{ad - bc} & -\dfrac{c}{ad - bc} \\[2ex] -\dfrac{b}{ad - bc} & \dfrac{a}{ad - bc} \end{bmatrix}
\]
which is the same matrix that results if $A^{-1}$ is transposed (verify). Thus,
\[ (A^T)^{-1} = (A^{-1})^T \]
as guaranteed by Theorem 1.4.9.


Concept Review 


* Commutative law for matrix addition
* Associative law for matrix addition
* Associative law for matrix multiplication
* Left and right distributive laws
* Zero matrix
* Identity matrix
* Inverse of a matrix
* Invertible matrix
* Nonsingular matrix
* Singular matrix
* Determinant
* Power of a matrix
* Matrix polynomial


Skills 


* Know the arithmetic properties of matrix operations.
* Be able to prove arithmetic properties of matrices.
* Know the properties of zero matrices.
* Know the properties of identity matrices.
* Be able to recognize when two square matrices are inverses of each other.
* Be able to determine whether a 2 × 2 matrix is invertible.
* Be able to solve a linear system of two equations in two unknowns whose coefficient matrix is
  invertible.
* Be able to prove basic properties involving invertible matrices.
* Know the properties of the matrix transpose and its relationship with invertible matrices.


Exercise Set 1.4 


1. Let 
2 =] 3 8 =3 —5 0 -2 3 
A= 0 4 5 B—|0 1 2 C= 7 4 a=4, b= = 
-2 14 4 =? 6 3 5 9 
Show that

(a) A + (B + C) = (A + B) + C
(b) (AB)C = A(BC)
(c) (a + b)C = aC + bC
(d) a(B − C) = aB − aC
2. Using the matrices and scalars in Exercise 1, verify that
(a) a(BC) = (aB)C = B(aC)
(b) A(B − C) = AB − AC
(c) (B + C)A = BA + CA
(d) a(bC) = (ab)C


3. Using the matrices and scalars in Exercise 1, verify that
(a) $(A^T)^T = A$
(b) $(A + B)^T = A^T + B^T$
(c) $(aC)^T = aC^T$
(d) $(AB)^T = B^TA^T$


In Exercises 4—7 use Theorem 1.4.5 to compute the inverses of the following matrices. 


Answer: 


po = 


о ml 


8. Find the inverse of 


9. Find the inverse of 


—l 
10. Use the matrix А in Exercise 4 to verify that | А | = 


п 


cos B 
=sin 6 


(e" +e") 


е-е) 


-i 
* Use the matrix В in Exercise 5 to verify that (2 7) = 


sin B 
cos B 


(e —e7) 


hl pole 


ет) 


ey 
(p). 


12. Use the matrices A and B in Exercises 4 and 5 to verify that $(AB)^{-1} = B^{-1}A^{-1}$.


13. Use the matrices A, B, and C in Exercises 4–6 to verify that $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$.


In Exercises 14—17, use the given information to find А. 


14. ,-1 2 —1 
AT! = 


15. aa =|! 4 


1 -2 
Answer 
2 
= 1 
7 
А= 
1. 
7 7 
—1 1, = 
кш | 


Answer 
ET EN 
13 13 
2 Q5 
13 13 


18. Let A be the matrix 


In each part, compute the given quantity. 
(a) А? 

(b) 42 

(c) $A^2 - 2A + I$

(d) p(A), where p(x) = x − 2

(e) p(A), where $p(x) = 2x^2 - x + 1$
(f) p(A), where $p(x) = x^3 - 2x + 4$


19. Repeat Exercise 18 for the matrix 
$-1 
A= 
HI 


Answer: 


20. 


2 


— 


( [39 13 
26 13 


Repeat Exercise 18 for the matrix 


. Repeat Exercise 18 for the matrix 


0 0 


0.026 0.018 
—0.018 0.026 


0 0 

—5 —12 

—5 

0 0 
-3 3 
-3 —3 

16 0 0 

0 —14 —15 

0 15 —14 


oor © о 
pt 
M 


[25 0 0 
0 32 —24 
024 32 


In Exercises 22–24, let $p_1(x) = x^2 - 9$, $p_2(x) = x + 3$, and $p_3(x) = x - 3$. Show that
$p_1(A) = p_2(A)p_3(A)$ for the given matrix.


22. The matrix A in Exercise 18. 
23. The matrix A in Exercise 21. 
24. An arbitrary square matrix A. 


25. Show that if $p(x) = x^2 - (a + d)x + (ad - bc)$ and
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
then p(A) = 0.
26. Show that if $p(x) = x^3 - (a + b + e)x^2 + (ab + ae + be - cd)x - a(be - cd)$ and
\[ A = \begin{bmatrix} a & 0 & 0 \\ 0 & b & c \\ 0 & d & e \end{bmatrix} \]
then p(A) = 0.
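27. Consider the matrix
\[
A = \begin{bmatrix}
a_{11} & 0 & \cdots & 0 \\
0 & a_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_{nn}
\end{bmatrix}
\]
where $a_{11}a_{22} \cdots a_{nn} \neq 0$. Show that A is invertible and find its inverse.
Answer:
\[
A^{-1} = \begin{bmatrix}
\frac{1}{a_{11}} & 0 & \cdots & 0 \\
0 & \frac{1}{a_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \frac{1}{a_{nn}}
\end{bmatrix}
\]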
28. Show that if a square matrix A satisfies $A^2 - 3A + I = 0$, then $A^{-1} = 3I - A$.
29. (a) Show that a matrix with a row of zeros cannot have an inverse. 
(b) Show that a matrix with a column of zeros cannot have an inverse. 


30. Assuming that all matrices are n × n and invertible, solve for D.


АВСТРВАТС = ABT 


31. Assuming that all matrices are n × n and invertible, solve for D.
Cla ade pA aC ac" 
Answer: 
—l 
DECA 1BAgc? (27) А? 
32. If A is a square matrix and n is a positive integer, is it true that $(A^n)^T = (A^T)^n$? Justify your answer.
33. Simplify: 
-i 
(AB) (ac^ | (pc) p 
Answer: 
B -i 
34. Simplify: 
Zi dae et Е gal 
(4c ) (ac Jc | AD 
In Exercises 35–37, determine whether A is invertible, and if so, find the inverse. [Hint: Solve AX = I for X
by equating corresponding entries on the two sides.]


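35.
\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \]

Answer:
\[
A^{-1} = \begin{bmatrix}
\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\
-\frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\
\frac{1}{2} & -\frac{1}{2} & \frac{1}{2}
\end{bmatrix}
\]
36.
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix} \]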
37. 00 1 
А= Ld 
—1 1 1 


38. 


о mM|-2 m|- 
о NH m|- 


Prove Theorem 1.4.2. 


In Exercises 39–42, use the method of Example 8 to find the unique solution of the given linear system.


39. $3x_1 - 2x_2 = -1$
    $4x_1 + 5x_2 = 3$

Answer: $x_1 = \frac{1}{23}$, $x_2 = \frac{13}{23}$

40. $-x_1 + 5x_2 = 4$
    $-x_1 - 3x_2 = 1$

41. $6x_1 + x_2 = 0$
    $4x_1 - 3x_2 = -2$

Answer: $x_1 = -\frac{1}{11}$, $x_2 = \frac{6}{11}$

42. $2x_1 - 2x_2 = 4$
    $x_1 + 4x_2 = 4$

43. Prove part (a) of Theorem 1.4.1.

44. Prove part (c) of Theorem 1.4.1.

45. Prove part (f) of Theorem 1.4.1.

46. Prove part (b) of Theorem 1.4.2.

47. Prove part (c) of Theorem 1.4.2.

48. Verify Formula 4 in the text by a direct calculation.

49. Prove part (d) of Theorem 1.4.8.

50. Prove part (e) of Theorem 1.4.8.

51. (a) Show that if A is invertible and AB = AC, then B = C.
    (b) Explain why part (a) and Example 3 do not contradict one another.

52. Show that if A is invertible and k is any nonzero scalar, then $(kA)^n = k^nA^n$ for all integer values of n.

53. (a) Show that if A, B, and A + B are invertible matrices with the same size, then
\[ A(A^{-1} + B^{-1})B(A + B)^{-1} = I \]
    (b) What does the result in part (a) tell you about the matrix $A^{-1} + B^{-1}$?


54. A square matrix A is said to be idempotent if $A^2 = A$.
(a) Show that if A is idempotent, then so is I − A.

(b) Show that if A is idempotent, then 2A − I is invertible and is its own inverse.


55. Show that if A is a square matrix such that $A^k = 0$ for some positive integer k, then the matrix I − A is
invertible and
\[ (I - A)^{-1} = I + A + A^2 + \cdots + A^{k-1} \]
True-False Exercises 
In parts (a)-(k) determine whether the statement is true or false, and justify your answer. 
(a) Two n × n matrices, A and B, are inverses of one another if and only if AB = BA = 0.
Answer: 


False 


(b) For all square matrices A and B of the same size, it is true that $(A + B)^2 = A^2 + 2AB + B^2$.


Answer: 


False 


(c) For all square matrices A and B of the same size, it is true that $A^2 - B^2 = (A - B)(A + B)$.


Answer: 


False 


(d) If A and B are invertible matrices of the same size, then AB is invertible and $(AB)^{-1} = A^{-1}B^{-1}$.


Answer: 


False 


(e) If A and B are matrices such that AB is defined, then it is true that $(AB)^T = A^TB^T$.


Answer: 


False 
(f) The matrix
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
is invertible if and only if ad − bc ≠ 0.
Answer: 


True 


(g) If A and B are matrices of the same size and k is a constant, then $(kA + B)^T = kA^T + B^T$.


Answer: 


True 


(h) If A is an invertible matrix, then so is $A^T$.
Answer: 


True 


(i) If $p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_mx^m$ and I is an identity matrix, then
$p(I) = a_0 + a_1 + a_2 + \cdots + a_m$.
Answer: 


False 


(j) A square matrix containing a row or column of zeros cannot be invertible. 
Answer: 


True 


(k) The sum of two invertible matrices of the same size must be invertible. 
Answer: 


False 
1.5 Elementary Matrices and a Method for Finding $A^{-1}$


In this section we will develop an algorithm for finding the inverse of a matrix, and we will discuss some of the 
basic properties of invertible matrices. 


In Section 1.1 we defined three elementary row operations on a matrix А: 
1. Multiply a row by a nonzero constant c. 

2. Interchange two rows. 

3. Add a constant c times one row to another.


It should be evident that if we let B be the matrix that results from A by performing one of the operations in this 
list, then the matrix A can be recovered from B by performing the corresponding operation in the following list: 


1. Multiply the same row by 1/c.
2. Interchange the same two rows.
3. If B resulted by adding c times row $r_1$ of A to row $r_2$, then add −c times $r_1$ to $r_2$.


It follows that if B is obtained from A by performing a sequence of elementary row operations, then there is a 
second sequence of elementary row operations, which when applied to B recovers А (Exercise 43). Accordingly, 
we make the following definition. 


DEFINITION 1 


Matrices А and В are said to be row equivalent if either (hence each) can be obtained from the other by 
a sequence of elementary row operations. 


Our next goal is to show how matrix multiplication can be used to carry out an elementary row operation. 


DEFINITION 2 


An n × n matrix is called an elementary matrix if it can be obtained from the n × n identity matrix $I_n$
by performing a single elementary row operation.


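EXAMPLE 1 Elementary Matrices and Row Operations


Listed below are four elementary matrices and the operations that produce them.

\[ \begin{bmatrix} 1 & 0 \\ 0 & -3 \end{bmatrix} \]
Multiply the second row of $I_2$ by −3.

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \]
Interchange the second and fourth rows of $I_4$.

\[ \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Add 3 times the third row of $I_3$ to the first row.

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Multiply the first row of $I_3$ by 1.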


The following theorem, whose proof is left as an exercise, shows that when a matrix A is multiplied on the left
by an elementary matrix E, the effect is to perform an elementary row operation on A.


THEOREM 1.5.1. Row Operations by Matrix Multiplication 


If the elementary matrix E results from performing a certain row operation on $I_m$ and if A is an m × n
matrix, then the product EA is the matrix that results when this same row operation is performed on A.


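EXAMPLE 2 Using Elementary Matrices


Consider the matrix
\[ A = \begin{bmatrix} 1 & 0 & 2 & 3 \\ 2 & -1 & 3 & 6 \\ 1 & 4 & 4 & 0 \end{bmatrix} \]
and consider the elementary matrix
\[ E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & 0 & 1 \end{bmatrix} \]
which results from adding 3 times the first row of $I_3$ to the third row. The product EA is
\[ EA = \begin{bmatrix} 1 & 0 & 2 & 3 \\ 2 & -1 & 3 & 6 \\ 4 & 4 & 10 & 9 \end{bmatrix} \]
which is precisely the same matrix that results when we add 3 times the first row of A to the third
row.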
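
Theorem 1.5.1 can be checked by performing the row operation both ways. This sketch assumes the matrices of Example 2 are A = [1 0 2 3; 2 −1 3 6; 1 4 4 0] and E = [1 0 0; 0 1 0; 3 0 1]; the helper names are our own and rows are indexed from 0.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add_multiple_of_row(M, c, src, dst):
    """Return a copy of M with c times row src added to row dst (0-based indices)."""
    result = [row[:] for row in M]
    result[dst] = [x + c * y for x, y in zip(M[dst], M[src])]
    return result

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1, 0, 2, 3], [2, -1, 3, 6], [1, 4, 4, 0]]

E = add_multiple_of_row(I3, 3, 0, 2)        # elementary matrix: 3 * (row 1) added to row 3
via_product = matmul(E, A)                  # left-multiply by E
direct = add_multiple_of_row(A, 3, 0, 2)    # perform the same operation directly on A
print(via_product == direct)
```

In practice the direct operation is cheaper than forming the product; the elementary matrix is mainly a theoretical device.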


Theorem 1.5.1 will be a useful tool for 
developing new results about matrices, 
but as a practical matter it is usually 
preferable to perform row operations 
directly. 


We know from the discussion at the beginning of this section that if E is an elementary matrix that results from
performing an elementary row operation on an identity matrix I, then there is a second elementary row
operation, which when applied to E, produces I back again. Table 1 lists these operations. The operations on the
right side of the table are called the inverse operations of the corresponding operations on the left.


Table 1

Row Operation on I That Produces E     Row Operation on E That Reproduces I
Multiply row i by c ≠ 0                Multiply row i by 1/c
Interchange rows i and j               Interchange rows i and j
Add c times row i to row j             Add −c times row i to row j


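EXAMPLE 3 Row Operations and Inverse Row Operations


In each of the following, an elementary row operation is applied to the 2 × 2 identity matrix to
obtain an elementary matrix E, then E is restored to the identity matrix by applying the inverse row
operation.

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 7 \end{bmatrix} \longrightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
Multiply the second row by 7; then multiply the second row by 1/7.

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \longrightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
Interchange the first and second rows; then interchange the first and second rows.

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \longrightarrow
\begin{bmatrix} 1 & 5 \\ 0 & 1 \end{bmatrix} \longrightarrow
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
Add 5 times the second row to the first; then add −5 times the second row to the first.


The next theorem is a key result about invertibility of elementary matrices. It will be a building block for many
results that follow.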


THEOREM 1.5.2


Every elementary matrix is invertible, and the inverse is also an elementary matrix. 


Proof If E is an elementary matrix, then E results by performing some row operation on I. Let $E_0$ be the
matrix that results when the inverse of this operation is performed on I. Applying Theorem 1.5.1 and using the
fact that inverse row operations cancel the effect of each other, it follows that
\[ E_0E = I \quad \text{and} \quad EE_0 = I \]

Thus, the elementary matrix $E_0$ is the inverse of E.


Equivalence Theorem 


One of our objectives as we progress through this text is to show how seemingly diverse ideas in linear algebra 
are related. The following theorem, which relates results we have obtained about invertibility of matrices, 
homogeneous linear systems, reduced row echelon forms, and elementary matrices, is our first step in that 
direction. As we study new topics, more statements will be added to this theorem. 


THEOREM 1.5.3 Equivalent Statements 


If A is an n × n matrix, then the following statements are equivalent, that is, all true or all false.
(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.


It may make the logic of our proof of Theorem
1.5.3 more apparent by writing the implications

(a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a)

as a closed cycle. This makes it evident visually that the validity
of any one statement implies the validity of all
the others, and hence that the falsity of any one
implies the falsity of the others.


Proof We will prove the equivalence by establishing the chain of implications:

(a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a)

(a) ⇒ (b) Assume A is invertible and let $x_0$ be any solution of Ax = 0. Multiplying both sides of this equation by the
matrix $A^{-1}$ gives $A^{-1}(Ax_0) = A^{-1}0$, or $(A^{-1}A)x_0 = 0$, or $Ix_0 = 0$, or $x_0 = 0$. Thus, Ax = 0 has only the
trivial solution.


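(b) ⇒ (c) Let Ax = 0 be the matrix form of the system
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0 \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= 0
\end{aligned}
\tag{1}
\]
and assume that the system has only the trivial solution. If we solve by Gauss–Jordan elimination, then the
system of equations corresponding to the reduced row echelon form of the augmented matrix will be
\[
\begin{aligned}
x_1 &= 0 \\
x_2 &= 0 \\
&\;\;\vdots \\
x_n &= 0
\end{aligned}
\tag{2}
\]
Thus the augmented matrix
\[
\left[\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & 0 \\
a_{21} & a_{22} & \cdots & a_{2n} & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & 0
\end{array}\right]
\]
for 1 can be reduced to the augmented matrix
\[
\left[\begin{array}{ccccc|c}
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0
\end{array}\right]
\]
for 2 by a sequence of elementary row operations. If we disregard the last column (all zeros) in each of these
matrices, we can conclude that the reduced row echelon form of A is $I_n$.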


(c) ⇒ (d) Assume that the reduced row echelon form of A is $I_n$, so that A can be reduced to $I_n$ by a finite
sequence of elementary row operations. By Theorem 1.5.1, each of these operations can be accomplished by
multiplying on the left by an appropriate elementary matrix. Thus we can find elementary matrices
$E_1, E_2, \ldots, E_k$ such that
\[ E_k \cdots E_2E_1A = I_n \tag{3} \]

By Theorem 1.5.2, $E_1, E_2, \ldots, E_k$ are invertible. Multiplying both sides of Equation 3 on the left successively
by $E_k^{-1}, \ldots, E_2^{-1}, E_1^{-1}$ we obtain
\[ A = E_1^{-1}E_2^{-1} \cdots E_k^{-1}I_n = E_1^{-1}E_2^{-1} \cdots E_k^{-1} \tag{4} \]

By Theorem 1.5.2, this equation expresses A as a product of elementary matrices.


(d) ⇒ (a) If A is a product of elementary matrices, then from Theorem 1.4.6 (extended to several factors) and
Theorem 1.5.2, the matrix A is a product of invertible matrices and hence is invertible.


A Method for Inverting Matrices 


As a first application of Theorem 1.5.3, we will develop a procedure (or algorithm) that can be used to tell
whether a given matrix is invertible, and if so, produce its inverse. To derive this algorithm, assume for the
moment that A is an invertible n × n matrix. In Equation 3, the elementary matrices execute a sequence of row
operations that reduce A to $I_n$. If we multiply both sides of this equation on the right by $A^{-1}$ and simplify, we
obtain
\[ A^{-1} = E_k \cdots E_2E_1I_n \]
But this equation tells us that the same sequence of row operations that reduces A to $I_n$ will transform $I_n$ to
$A^{-1}$. Thus, we have established the following result.


Inversion Algorithm 


To find the inverse of an invertible matrix A, find a sequence of elementary row operations that reduces
A to the identity and then perform that same sequence of operations on $I_n$ to obtain $A^{-1}$.


A simple method for carrying out this procedure is given in the following example. 


EXAMPLE 4 Using Row Operations to Find А1 4 


Find the inverse of

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix} \]


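Solution We want to reduce A to the identity matrix by row operations and simultaneously
apply these operations to I to produce $A^{-1}$. To accomplish this we will adjoin the identity matrix
to the right side of A, thereby producing a partitioned matrix of the form

\[ [A \mid I] \]

Then we will apply row operations to this matrix until the left side is reduced to I; these
operations will convert the right side to $A^{-1}$, so the final matrix will have the form

\[ [I \mid A^{-1}] \]

The computations are as follows:

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 5 & 3 & 0 & 1 & 0 \\ 1 & 0 & 8 & 0 & 0 & 1 \end{array}\right] \]

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & -2 & 5 & -1 & 0 & 1 \end{array}\right] \]
We added −2 times the first row to the second and −1 times the first row to the third.

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & -1 & -5 & 2 & 1 \end{array}\right] \]
We added 2 times the second row to the third.

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & -3 & -2 & 1 & 0 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right] \]
We multiplied the third row by −1.

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -14 & 6 & 3 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right] \]
We added 3 times the third row to the second and −3 times the third row to the first.

\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -40 & 16 & 9 \\ 0 & 1 & 0 & 13 & -5 & -3 \\ 0 & 0 & 1 & 5 & -2 & -1 \end{array}\right] \]
We added −2 times the second row to the first.

Thus,

\[ A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix} \]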


Often it will not be known in advance if a given n × n matrix A is invertible. However, if it is not, then by parts
(a) and (c) of Theorem 1.5.3 it will be impossible to reduce A to $I_n$ by elementary row operations. This will be
signaled by a row of zeros appearing on the left side of the partition at some stage of the inversion algorithm. If
this occurs, then you can stop the computations and conclude that A is not invertible.
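
The inversion algorithm, together with the stopping rule just described, can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions rather than a definitive implementation: the function name `invert` and the use of exact `Fraction` arithmetic are choices made here, not part of the text.

```python
from fractions import Fraction

def invert(matrix):
    """Row-reduce [A | I]; return the inverse of A, or None if A is not invertible."""
    n = len(matrix)
    # Build the partitioned matrix [A | I] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the left side cannot be reduced to the identity
        aug[col], aug[pivot] = aug[pivot], aug[col]   # interchange two rows
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]          # scale the row to get a leading 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                # add a multiple of the pivot row to row r
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The left side is now I, so the right side is the inverse.
    return [row[n:] for row in aug]
```

Calling `invert([[1, 2, 3], [2, 5, 3], [1, 0, 8]])` reproduces the inverse found in Example 4, while a matrix that is not invertible yields `None`.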


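EXAMPLE 5 Showing That a Matrix Is Not Invertible


Consider the matrix

\[ A = \begin{bmatrix} 1 & 6 & 4 \\ 2 & 4 & -1 \\ -1 & 2 & 5 \end{bmatrix} \]

Applying the procedure of Example 4 yields

\[ \left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 2 & 4 & -1 & 0 & 1 & 0 \\ -1 & 2 & 5 & 0 & 0 & 1 \end{array}\right] \]

\[ \left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 0 & -8 & -9 & -2 & 1 & 0 \\ 0 & 8 & 9 & 1 & 0 & 1 \end{array}\right] \]
We added −2 times the first row to the second and added the first row to the third.

\[ \left[\begin{array}{rrr|rrr} 1 & 6 & 4 & 1 & 0 & 0 \\ 0 & -8 & -9 & -2 & 1 & 0 \\ 0 & 0 & 0 & -1 & 1 & 1 \end{array}\right] \]
We added the second row to the third.


Since we have obtained a row of zeros on the left side, A is not invertible.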


EXAMPLE 6 Analyzing Homogeneous Systems


Use Theorem 1.5.3 to determine whether the given homogeneous system has nontrivial solutions.

(a)
\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= 0 \\ 2x_1 + 5x_2 + 3x_3 &= 0 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= 0 \end{aligned} \]

(b)
\[ \begin{aligned} x_1 + 6x_2 + 4x_3 &= 0 \\ 2x_1 + 4x_2 - x_3 &= 0 \\ -x_1 + 2x_2 + 5x_3 &= 0 \end{aligned} \]


Solution From parts (a) and (b) of Theorem 1.5.3 a homogeneous linear system has only the
trivial solution if and only if its coefficient matrix is invertible. From Example 4 and Example 5
the coefficient matrix of system (a) is invertible and that of system (b) is not. Thus, system (a) has
only the trivial solution whereas system (b) has nontrivial solutions.


Concept Review 
* Row equivalent matrices 
* Elementary matrix 
* Inverse operations
* Inversion algorithm


Skills 

* Determine whether a given square matrix is an elementary matrix.

* Determine whether two square matrices are row equivalent. 

* Apply the inverse of a given elementary row operation to a matrix.


* Apply elementary row operations to reduce a given square matrix to the identity matrix. 


* Understand the relationships between statements that are equivalent to the invertibility of a square 
matrix (Theorem 1.5.3). 


* Use the inversion algorithm to find the inverse of an invertible matrix. 


* Express an invertible matrix as a product of elementary matrices. 


Exercise Set 1.5 


1. Decide whether each matrix below is an elementary matrix. 


ЦЕН 


ШЕ 1 
(с)|1 1 0 
0 0 1 
00 0 
(@ 12002 
0100 
0010 
0001 
Answer: 


(a) Elementary 

(b) Not elementary 
(c) Not elementary 
(d) Not elementary 


2. Decide whether each matrix below is an elementary matrix. 


à 


(b [0 0 1 
010 
100 

(с) |1 0 0 
019 
00 1 

(d |—1 0 0 

00 1 
010 


3. Find a row operation and the corresponding elementary matrix that will restore the given elementary matrix to
the identity matrix.


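(b) $\begin{bmatrix} -7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -5 & 0 & 1 \end{bmatrix}$

(d) $\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$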
Answer: 
(a) Add 3 times row 2 to row 1: [ 1 
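(b) Multiply row 1 by $-\tfrac{1}{7}$: $\begin{bmatrix} -\tfrac{1}{7} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

(c) Add 5 times row 1 to row 3: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 5 & 0 & 1 \end{bmatrix}$

(d) Swap rows 1 and 3: $\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$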


4. Find a row operation and the corresponding elementary matrix that will restore the given elementary matrix to
the identity matrix.


d 1 

® lı 0 -2 0 
01 00 
00. do 
00 01 


5. In each part, an elementary matrix E and a matrix A are given. Write down the row operation corresponding 
to E and show that the product EA results from applying the row operation to A. 


aà)g [0 1] , [21-2 5 -1 
e=|( | 4-| 3 —6 —6 E 


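(b) $E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{bmatrix}$, $A = \begin{bmatrix} 2 & -1 & 0 & -4 & -4 \\ 1 & -3 & -1 & 5 & 3 \\ 2 & 0 & 1 & 3 & -1 \end{bmatrix}$

(c) $E = \begin{bmatrix} 1 & 0 & 4 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, $A = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$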
Answer: 
(a) Swap rows | and 2: za=| 3 = к =| 
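(b) Add −3 times row 2 to row 3: $EA = \begin{bmatrix} 2 & -1 & 0 & -4 & -4 \\ 1 & -3 & -1 & 5 & 3 \\ -1 & 9 & 4 & -12 & -10 \end{bmatrix}$

(c) Add 4 times row 3 to row 1: $EA = \begin{bmatrix} 13 & 28 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$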


6. In each part, an elementary matrix E and a matrix А are given. Write down the row operation corresponding 
to E and show that the product EA results from applying the row operation to A. 


(a) p_[-6 0] , [231-2 5 -1 
g-| 0 i} 4=| 3 —6 —6 = 


(b) 1 


In Exercises 7-8, use the following matrices. 


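\[ A = \begin{bmatrix} 3 & 4 & 1 \\ 2 & -7 & -1 \\ 8 & 1 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 8 & 1 & 5 \\ 2 & -7 & -1 \\ 3 & 4 & 1 \end{bmatrix} \]

\[ C = \begin{bmatrix} 3 & 4 & 1 \\ 2 & -7 & -1 \\ 2 & -7 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 8 & 1 & 5 \\ -6 & 21 & 3 \\ 3 & 4 & 1 \end{bmatrix}, \quad F = \begin{bmatrix} 8 & 1 & 5 \\ 8 & 1 & 1 \\ 3 & 4 & 1 \end{bmatrix} \]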


7. Find an elementary matrix E that satisfies the equation. 


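(a) EA = B

(b) EB = A

(c) EA = C

(d) EC = A

Answer:

(a) $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$

(b) $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}$

(d) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}$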

8. Find an elementary matrix E that satisfies the equation. 

(a) EB = D

(b) ED = B

(c) EB = F

(d) EF = B


In Exercises 9—24, use the inversion algorithm to find the inverse of the given matrix, if the inverse exists. 


$4 


Answer: 


Answer: 


Answer: 


Answer: 


No inverse 


cx“ 
ч о 
"Iesse 
ec | 


oret ox cU 


__J 


© m m 


> С 
= x 


Answer: 


pe 
HIN FIN e 


|С =j lo 


T1 
Оо о rm 


su 
сша 
[ 


. 


es | 
WO Oo pr 


Oh г> 
CN C AN 


Answer: 


Answer: 


oO oO o a 


M pae lou -|2 
gs 


| 


23.| —1 0 1 0 
2 3 —2 6 
0 —1 2 0 
0 0 15 
Answer 
РЕ ОС Зо ЕЕЕ: 
12 24 8 4 
2+ XJ. E ed. 
6 12 4 2 
Эз ы эд 
12 24 8 4 
ek ЭС X 
12 24 8 4 
24.| 0 0 2 0 
1 0 0 1 
0 =1 3 0 
2 15 -3 


In Exercises 25-26, find the inverse of each of the following 4 × 4 matrices, where $k_1$, $k_2$, $k_3$, $k_4$, and $k$ are
all nonzero.


25.a) [k 0 0 0 
Ok 0 0 
0 0k 0 
0 0 O0 ky 


є_ = O C 


m 

Јас 

ou 0. d 
zA 

0 0 do 

оо о 4 


bL ed 
pop 
ото 0 
dl 
U. ee 
Qe cq. d 


26.) [0 0 0 ky 


(b) 


In Exercise 27—Exercise 28, find all values of c, if any, for which the given matrix is invertible. 


27. Cre 
] e 
m E. 


Answer: 


In Exercises 29—32, write the given matrix as a product of elementary matrices. 
29.| —3 1 
22 


Answer: 


L2 ajo 2 [0 КИН 


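31. $\begin{bmatrix} 1 & 0 & -2 \\ 0 & 4 & 3 \\ 0 & 0 & 1 \end{bmatrix}$


Answer:


\[ \begin{bmatrix} 1 & 0 & -2 \\ 0 & 4 & 3 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

32. $\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}$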


In Exercises 33—36, write the inverse of the given matrix as a product of elementary matrices. 


33. The matrix in Exercise 29. 


Answer: 
d. 
48 [10o]-lopi -0]!? 
i 3| L-1 1]| $ llo ijjo 4 
4 8 


34. The matrix in Exercise 30. 


35. The matrix in Exercise 31. 


Answer: 

0. р ку sc dp 5 
o + -3!2zlologlo: 311010 
TUUM : 00 11|00 1 
6 0- 1l [06:6 1 


36. The matrix in Exercise 32. 


In Exercises 37—38, show that the given matrices А and В are row equivalent, and find a sequence of 
elementary row operations that produces В from A. 


37. $A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 4 & 1 \\ 2 & 1 & 9 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 0 & 5 \\ 0 & 2 & -2 \\ 1 & 1 & 4 \end{bmatrix}$

Answer:


Add −1 times the first row to the second row. Add −1 times the first row to the third row. Add −1 times
the second row to the first row. Add the second row to the third row.


38. 21 0 6 3 4 
=| -1 1 0|, 8=|-5 -1 0 
3 0 =1 -] -2 -1 


39. Show that if

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ a & b & c \end{bmatrix} \]

is an elementary matrix, then at least one entry in the third row must be a zero.


40. Show that 


MN 

| 
о о о & о 
о о 5 о 9 


о Э обо о 
зз об oo 
oO © © © 


is not invertible for any values of the entries.


41. Prove that if A and B are m × n matrices, then A and B are row equivalent if and only if A and B have the
same reduced row echelon form.


42. Prove that if A is an invertible matrix and B is row equivalent to A, then B is also invertible.


43. Show that if B is obtained from А by performing a sequence of elementary row operations, then there is a 
second sequence of elementary row operations, which when applied to B recovers A. 


True-False Exercises 

In parts (a)-(g) determine whether the statement is true or false, and justify your answer.

(a) The product of two elementary matrices of the same size must be an elementary matrix. 
Answer: 


False 


(b) Every elementary matrix is invertible. 
Answer: 


True 


(c) If A and B are row equivalent, and if В and C are row equivalent, then А and C are row equivalent. 
Answer: 


True 


(d) If A is an n × n matrix that is not invertible, then the linear system $A\mathbf{x} = \mathbf{0}$ has infinitely many solutions.
Answer: 


True 


(e) If A is an n × n matrix that is not invertible, then the matrix obtained by interchanging two rows of A cannot
be invertible.


Answer: 


True 


(f) If A is invertible and a multiple of the first row of A is added to the second row, then the resulting matrix is 
invertible. 


Answer: 


True 


(g) An expression of the invertible matrix A as a product of elementary matrices is unique. 
Answer: 


False 
1.6 More on Linear Systems and Invertible Matrices


In this section we will show how the inverse of a matrix can be used to solve a linear system and we will develop some more results about 
invertible matrices. 


Number of Solutions of a Linear System 


In Section 1.1 we made the statement (based on Figures 1.1.1 and 1.1.2) that every linear system has no solutions, exactly one solution,
or infinitely many solutions. We are now in a position to prove this fundamental result.


THEOREM 1.6.1 


A system of linear equations has zero, one, or infinitely many solutions. There are no other possibilities. 


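Proof If $A\mathbf{x} = \mathbf{b}$ is a system of linear equations, exactly one of the following is true: (a) the system has no solutions, (b) the system has exactly
one solution, or (c) the system has more than one solution. The proof will be complete if we can show that the system has infinitely many solutions
in case (c).


Assume that $A\mathbf{x} = \mathbf{b}$ has more than one solution, and let $\mathbf{x}_0 = \mathbf{x}_1 - \mathbf{x}_2$, where $\mathbf{x}_1$ and $\mathbf{x}_2$ are any two distinct solutions. Because $\mathbf{x}_1$ and $\mathbf{x}_2$ are
distinct, the matrix $\mathbf{x}_0$ is nonzero; moreover,

\[ A\mathbf{x}_0 = A(\mathbf{x}_1 - \mathbf{x}_2) = A\mathbf{x}_1 - A\mathbf{x}_2 = \mathbf{b} - \mathbf{b} = \mathbf{0} \]

If we now let k be any scalar, then

\[ A(\mathbf{x}_1 + k\mathbf{x}_0) = A\mathbf{x}_1 + A(k\mathbf{x}_0) = A\mathbf{x}_1 + k(A\mathbf{x}_0) = \mathbf{b} + k\mathbf{0} = \mathbf{b} + \mathbf{0} = \mathbf{b} \]

But this says that $\mathbf{x}_1 + k\mathbf{x}_0$ is a solution of $A\mathbf{x} = \mathbf{b}$. Since $\mathbf{x}_0$ is nonzero and there are infinitely many choices for k, the system $A\mathbf{x} = \mathbf{b}$ has
infinitely many solutions.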
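
The construction in the proof can be tried numerically. The sketch below uses a small singular system chosen here purely for illustration (the matrix `A`, the right side `b`, and the two starting solutions are hypothetical, not taken from the text) and checks that every x1 + k x0 it generates solves the system.

```python
def mat_vec(A, x):
    """Product of a matrix (list of rows) with a column given as a flat list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical system with more than one solution.
A = [[1, 2], [2, 4]]
b = [3, 6]
x1 = [3, 0]          # one solution of Ax = b
x2 = [1, 1]          # a different solution of Ax = b

# x0 = x1 - x2 is nonzero and satisfies A x0 = 0.
x0 = [p - q for p, q in zip(x1, x2)]
assert mat_vec(A, x0) == [0, 0]

# Each choice of the scalar k gives another solution x1 + k*x0.
solutions = [[p + k * q for p, q in zip(x1, x0)] for k in range(-2, 3)]
all_solve = all(mat_vec(A, x) == b for x in solutions)
```

Only five values of k are sampled here; the proof shows the same thing holds for every scalar k.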


Solving Linear Systems by Matrix Inversion 


Thus far we have studied two procedures for solving linear systems: Gauss-Jordan elimination and Gaussian elimination. The following theorem
provides an actual formula for the solution of a linear system of n equations in n unknowns in the case where the coefficient matrix is invertible.


THEOREM 1.6.2 


If A is an invertible n × n matrix, then for each n × 1 matrix $\mathbf{b}$, the system of equations $A\mathbf{x} = \mathbf{b}$ has exactly one solution, namely, $\mathbf{x} = A^{-1}\mathbf{b}$.


Proof Since $A(A^{-1}\mathbf{b}) = \mathbf{b}$, it follows that $\mathbf{x} = A^{-1}\mathbf{b}$ is a solution of $A\mathbf{x} = \mathbf{b}$. To show that this is the only solution, we will assume that $\mathbf{x}_0$ is an
arbitrary solution and then show that $\mathbf{x}_0$ must be the solution $A^{-1}\mathbf{b}$.


If $\mathbf{x}_0$ is any solution of $A\mathbf{x} = \mathbf{b}$, then $A\mathbf{x}_0 = \mathbf{b}$. Multiplying both sides of this equation by $A^{-1}$, we obtain $\mathbf{x}_0 = A^{-1}\mathbf{b}$.


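EXAMPLE 1 Solution of a Linear System Using $A^{-1}$


Consider the system of linear equations

\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= 5 \\ 2x_1 + 5x_2 + 3x_3 &= 3 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= 17 \end{aligned} \]

In matrix form this system can be written as $A\mathbf{x} = \mathbf{b}$, where

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 5 \\ 3 \\ 17 \end{bmatrix} \]

In Example 4 of the preceding section, we showed that A is invertible and

\[ A^{-1} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix} \]

By Theorem 1.6.2, the solution of the system is

\[ \mathbf{x} = A^{-1}\mathbf{b} = \begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix} \begin{bmatrix} 5 \\ 3 \\ 17 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \]

or $x_1 = 1$, $x_2 = -1$, $x_3 = 2$.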


Keep in mind that the method of Example 1 only applies when the 
system has as many equations as unknowns and the coefficient 
matrix is invertible. 
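
As a quick check of the formula x = A⁻¹b, the following sketch multiplies the inverse from Example 4 of the preceding section by the right side of Example 1 and then substitutes the result back into the system. The helper name `mat_vec` is our own; plain Python lists stand in for matrices.

```python
def mat_vec(M, v):
    """Product of a matrix (list of rows) with a column given as a flat list."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

A = [[1, 2, 3], [2, 5, 3], [1, 0, 8]]
A_inv = [[-40, 16, 9], [13, -5, -3], [5, -2, -1]]   # inverse of A, taken as given
b = [5, 3, 17]

x = mat_vec(A_inv, b)       # x = A^{-1} b
residual_ok = mat_vec(A, x) == b   # substituting x back reproduces b
```

The substitution step is worth keeping in practice, since it catches arithmetic slips in a hand-computed inverse.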


Linear Systems with a Common Coefficient Matrix 


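Frequently, one is concerned with solving a sequence of systems

\[ A\mathbf{x} = \mathbf{b}_1, \quad A\mathbf{x} = \mathbf{b}_2, \quad A\mathbf{x} = \mathbf{b}_3, \ldots, \quad A\mathbf{x} = \mathbf{b}_k \]

each of which has the same square coefficient matrix A. If A is invertible, then the solutions

\[ \mathbf{x}_1 = A^{-1}\mathbf{b}_1, \quad \mathbf{x}_2 = A^{-1}\mathbf{b}_2, \quad \mathbf{x}_3 = A^{-1}\mathbf{b}_3, \ldots, \quad \mathbf{x}_k = A^{-1}\mathbf{b}_k \]

can be obtained with one matrix inversion and k matrix multiplications. An efficient way to do this is to form the partitioned matrix

\[ [A \mid \mathbf{b}_1 \mid \mathbf{b}_2 \mid \cdots \mid \mathbf{b}_k] \tag{1} \]

in which the coefficient matrix A is "augmented" by all k of the matrices $\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_k$, and then reduce (1) to reduced row echelon form by Gauss-Jordan
elimination. In this way we can solve all k systems at once. This method has the added advantage that it applies even when A is not
invertible.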


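EXAMPLE 2 Solving Two Linear Systems at Once


Solve the systems

(a)
\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= 4 \\ 2x_1 + 5x_2 + 3x_3 &= 5 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= 9 \end{aligned} \]

(b)
\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= 1 \\ 2x_1 + 5x_2 + 3x_3 &= 6 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= -6 \end{aligned} \]


Solution The two systems have the same coefficient matrix. If we augment this coefficient matrix with the columns of constants on
the right sides of these systems, we obtain

\[ \left[\begin{array}{rrr|r|r} 1 & 2 & 3 & 4 & 1 \\ 2 & 5 & 3 & 5 & 6 \\ 1 & 0 & 8 & 9 & -6 \end{array}\right] \]

Reducing this matrix to reduced row echelon form yields (verify)

\[ \left[\begin{array}{rrr|r|r} 1 & 0 & 0 & 1 & 2 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & -1 \end{array}\right] \]

It follows from the last two columns that the solution of system (a) is $x_1 = 1$, $x_2 = 0$, $x_3 = 1$ and the solution of system (b) is $x_1 = 2$,
$x_2 = 1$, $x_3 = -1$.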
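
The "(verify)" step can be carried out mechanically. The sketch below is one possible Gauss-Jordan routine, written for illustration with exact `Fraction` arithmetic; the function name `rref` is our own. It is applied to the coefficient matrix with rows (1, 2, 3), (2, 5, 3), (1, 0, 8) augmented by the two columns of constants (4, 5, 9) and (1, 6, −6).

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M (a list of rows)."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    lead = 0                                  # row where the next leading 1 goes
    for col in range(cols):
        if lead == rows:
            break
        pivot = next((r for r in range(lead, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue                          # no leading 1 in this column
        M[lead], M[pivot] = M[pivot], M[lead]
        p = M[lead][col]
        M[lead] = [x / p for x in M[lead]]    # create the leading 1
        for r in range(rows):
            if r != lead and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[lead])]
        lead += 1
    return M

# Coefficient matrix augmented by both right-hand sides.
augmented = [[1, 2, 3, 4, 1],
             [2, 5, 3, 5, 6],
             [1, 0, 8, 9, -6]]
R = rref(augmented)
solution_a = [row[3] for row in R]   # fourth column: solution of system (a)
solution_b = [row[4] for row in R]   # fifth column: solution of system (b)
```

Reading the solutions off the last columns is valid here because the left block reduces to the identity; for a coefficient matrix that is not invertible the reduced form has to be interpreted row by row instead.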


Properties of Invertible Matrices 


Up to now, to show that an n × n matrix A is invertible, it has been necessary to find an n × n matrix B such that

\[ AB = I \quad \text{and} \quad BA = I \]

The next theorem shows that if we produce an n × n matrix B satisfying either condition, then the other condition holds automatically.


THEOREM 1.6.3 


Let A be a square matrix.
(a) If B is a square matrix satisfying $BA = I$, then $B = A^{-1}$.
(b) If B is a square matrix satisfying $AB = I$, then $B = A^{-1}$.


We will prove part (a) and leave part (b) as an exercise.


Proof (a) Assume that $BA = I$. If we can show that A is invertible, the proof can be completed by multiplying $BA = I$ on both sides by $A^{-1}$ to
obtain

\[ BAA^{-1} = IA^{-1} \quad \text{or} \quad BI = IA^{-1} \quad \text{or} \quad B = A^{-1} \]

To show that A is invertible, it suffices to show that the system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution (see Theorem 1.5.3). Let $\mathbf{x}_0$ be any solution of
this system. If we multiply both sides of $A\mathbf{x}_0 = \mathbf{0}$ on the left by B, we obtain $BA\mathbf{x}_0 = B\mathbf{0}$ or $I\mathbf{x}_0 = \mathbf{0}$ or $\mathbf{x}_0 = \mathbf{0}$. Thus, the system of equations
$A\mathbf{x} = \mathbf{0}$ has only the trivial solution.


Equivalence Theorem 


We are now in a position to add two more statements to the four given in Theorem 1.5.3. 


THEOREM 1.6.4 Equivalent Statements 


If A is an n × n matrix, then the following are equivalent.
(a) A is invertible.

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.

(c) The reduced row echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every n × 1 matrix $\mathbf{b}$.

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every n × 1 matrix $\mathbf{b}$.


It follows from the equivalency of parts (e) and (f) that if you can
show that $A\mathbf{x} = \mathbf{b}$ has at least one solution for every n × 1 matrix
$\mathbf{b}$, then you can conclude that it has exactly one solution for every
n × 1 matrix $\mathbf{b}$.


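Proof Since we proved in Theorem 1.5.3 that (a), (b), (c), and (d) are equivalent, it will be sufficient to prove that (a) ⇒ (f) ⇒ (e) ⇒ (a).


(a) ⇒ (f) This was already proved in Theorem 1.6.2.


(f) ⇒ (e) This is self-evident, for if $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every n × 1 matrix $\mathbf{b}$, then $A\mathbf{x} = \mathbf{b}$ is consistent for every n × 1 matrix $\mathbf{b}$.


(e) ⇒ (a) If the system $A\mathbf{x} = \mathbf{b}$ is consistent for every n × 1 matrix $\mathbf{b}$, then, in particular, this is so for the systems

\[ A\mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad A\mathbf{x} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \quad A\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \]

Let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ be solutions of the respective systems, and let us form an n × n matrix C having these solutions as columns. Thus C has the form

\[ C = [\mathbf{x}_1 \mid \mathbf{x}_2 \mid \cdots \mid \mathbf{x}_n] \]

As discussed in Section 1.3, the successive columns of the product AC will be

\[ A\mathbf{x}_1, \; A\mathbf{x}_2, \ldots, \; A\mathbf{x}_n \]

[see Formula 8 of Section 1.3]. Thus,

\[ AC = [A\mathbf{x}_1 \mid A\mathbf{x}_2 \mid \cdots \mid A\mathbf{x}_n] = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I \]

By part (b) of Theorem 1.6.3, it follows that $C = A^{-1}$. Thus, A is invertible.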


We know from earlier work that invertible matrix factors produce an invertible product. Conversely, the following theorem shows that if the
product of square matrices is invertible, then the factors themselves must be invertible.


THEOREM 1.6.5 


Let А and B be square matrices of the same size. If AB is invertible, then А and B must also be invertible. 


In our later work the following fundamental problem will occur frequently in various contexts. 


A Fundamental Problem 


Let A be a fixed m × n matrix. Find all m × 1 matrices $\mathbf{b}$ such that the system of equations $A\mathbf{x} = \mathbf{b}$ is consistent.


If A is an invertible matrix, Theorem 1.6.2 completely solves this problem by asserting that for every m × 1 matrix $\mathbf{b}$, the linear system $A\mathbf{x} = \mathbf{b}$ has
the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$. If A is not square, or if A is square but not invertible, then Theorem 1.6.2 does not apply. In these cases the matrix $\mathbf{b}$
must usually satisfy certain conditions in order for $A\mathbf{x} = \mathbf{b}$ to be consistent. The following example illustrates how the methods of Section 1.2 can
be used to determine such conditions.


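EXAMPLE 3 Determining Consistency by Elimination


What conditions must $b_1$, $b_2$, and $b_3$ satisfy in order for the system of equations

\[ \begin{aligned} x_1 + x_2 + 2x_3 &= b_1 \\ x_1 \phantom{{}+ x_2} + x_3 &= b_2 \\ 2x_1 + x_2 + 3x_3 &= b_3 \end{aligned} \]

to be consistent?

Solution The augmented matrix is

\[ \left[\begin{array}{rrr|l} 1 & 1 & 2 & b_1 \\ 1 & 0 & 1 & b_2 \\ 2 & 1 & 3 & b_3 \end{array}\right] \]

which can be reduced to row echelon form as follows:

\[ \left[\begin{array}{rrr|l} 1 & 1 & 2 & b_1 \\ 0 & -1 & -1 & b_2 - b_1 \\ 0 & -1 & -1 & b_3 - 2b_1 \end{array}\right] \]
−1 times the first row was added to the second and −2 times the first row was added to the third.

\[ \left[\begin{array}{rrr|l} 1 & 1 & 2 & b_1 \\ 0 & 1 & 1 & b_1 - b_2 \\ 0 & -1 & -1 & b_3 - 2b_1 \end{array}\right] \]
The second row was multiplied by −1.

\[ \left[\begin{array}{rrr|l} 1 & 1 & 2 & b_1 \\ 0 & 1 & 1 & b_1 - b_2 \\ 0 & 0 & 0 & b_3 - b_2 - b_1 \end{array}\right] \]
The second row was added to the third.


It is now evident from the third row in the matrix that the system has a solution if and only if $b_1$, $b_2$, and $b_3$ satisfy the condition

\[ b_3 - b_2 - b_1 = 0 \quad \text{or} \quad b_3 = b_1 + b_2 \]

To express this condition another way, $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b}$ is a matrix of the form

\[ \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ b_1 + b_2 \end{bmatrix} \]

where $b_1$ and $b_2$ are arbitrary.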
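
The consistency condition can be spot-checked for particular right sides. The sketch below is an illustrative elimination routine (the name `is_consistent` is our own) that reduces the augmented matrix to row echelon form and reports whether any row reads 0 = nonzero; it is applied to the coefficient matrix with rows (1, 1, 2), (1, 0, 1), (2, 1, 3).

```python
from fractions import Fraction

def is_consistent(A, b):
    """True when Ax = b has at least one solution (forward elimination only)."""
    n = len(A[0])                     # number of unknowns
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    rows = len(M)
    lead = 0
    for col in range(n):
        if lead == rows:
            break
        pivot = next((r for r in range(lead, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[lead], M[pivot] = M[pivot], M[lead]
        for r in range(lead + 1, rows):
            f = M[r][col] / M[lead][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[lead])]
        lead += 1
    # Inconsistent exactly when some row has zero coefficients but a nonzero constant.
    return all(any(x != 0 for x in row[:n]) or row[n] == 0 for row in M)

A = [[1, 1, 2], [1, 0, 1], [2, 1, 3]]
consistent_case = is_consistent(A, [1, 2, 3])     # b3 equals b1 + b2
inconsistent_case = is_consistent(A, [1, 2, 4])   # b3 differs from b1 + b2
```

A numerical check like this only tests individual vectors b; the symbolic elimination is still what produces the general condition.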


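EXAMPLE 4 Determining Consistency by Elimination


What conditions must $b_1$, $b_2$, and $b_3$ satisfy in order for the system of equations

\[ \begin{aligned} x_1 + 2x_2 + 3x_3 &= b_1 \\ 2x_1 + 5x_2 + 3x_3 &= b_2 \\ x_1 \phantom{{}+ 2x_2} + 8x_3 &= b_3 \end{aligned} \]

to be consistent?

Solution The augmented matrix is

\[ \left[\begin{array}{rrr|l} 1 & 2 & 3 & b_1 \\ 2 & 5 & 3 & b_2 \\ 1 & 0 & 8 & b_3 \end{array}\right] \]

Reducing this to reduced row echelon form yields (verify)

\[ \left[\begin{array}{rrr|l} 1 & 0 & 0 & -40b_1 + 16b_2 + 9b_3 \\ 0 & 1 & 0 & 13b_1 - 5b_2 - 3b_3 \\ 0 & 0 & 1 & 5b_1 - 2b_2 - b_3 \end{array}\right] \tag{2} \]

In this case there are no restrictions on $b_1$, $b_2$, and $b_3$, so the system has the unique solution

\[ x_1 = -40b_1 + 16b_2 + 9b_3, \quad x_2 = 13b_1 - 5b_2 - 3b_3, \quad x_3 = 5b_1 - 2b_2 - b_3 \tag{3} \]

for all values of $b_1$, $b_2$, and $b_3$.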


What does the result in Example 4 tell you about the coefficient 
matrix of the system? 


Skills 
* Determine whether a linear system of equations has no solutions, exactly one solution, or infinitely many solutions. 
* Solve a linear system by inverting its coefficient matrix.


* Solve multiple linear systems with the same coefficient matrix simultaneously. 


* Be familiar with the additional conditions of invertibility stated in the Equivalence Theorem. 


Exercise Set 1.6 


In Exercises 1-8, solve the system by inverting the coefficient matrix and using Theorem 1.6.2. 


1. Xj x22 
5х|+6х2=9 


Answer: 


3. х] 3х0 + х3 = 4 
2x1 2х3 6х3 = -=l 
2x1 3х9 х3 = 3 


Answer: 


xy= =], x3—4, x3— —7 
4, 5x1 + 3х2 + 2х3 = 
Зху+ 3х2 + 2х3 = 2 
х2+ х3 = 5 
5.
\[ \begin{aligned} x + y + z &= 5 \\ x + y - 4z &= 10 \\ -4x + y + z &= 0 \end{aligned} \]


Answer:


x = 1, y = 5, z = −1
6. —x-2py-3z = 
w+ ox-dd4y-4z 
w+ 3х + 77у + 92 = 
=w = 2x = 4y = 62 = 
7. 3x1 + 5х2 = 
xı 2х2 = 22 


ll 
A I м © 


Answer: 


х= 20у = 5р2, x3— — b] + 3b3 
8. x1 +2x2+ 3x3 = Ё 

2x1 + 5x2 + 5х3 ba 

3x1 + 5x2 + 8x3 = Ёз 


In Exercises 9-12, solve the linear systems together by reducing the appropriate augmented matrix. 


9, x; —5x2=4, 
3x1 + 2x2 = Ёз 
(i) 21=1, à3—4 
(i) 21= —2, à3—5 


Answer: 

D epee. Soe ol 
ORS ет 
(ii) ху = 21 х= Н 


10. —x1 +4х)+ x3 = 44 
х1+Е9х2— 2x3 
бху +4x2 = 8x3 = 43 
() #1=0, 53-1, 43=0 
(ii) b1 = —3, b2=4, а= —5 


Il 
D 
tà 


11. 4x1 — 7x2 = ё] 
xi 2х2 = 22 
(i) 21=0, 22 = 1 
(ii) = —4, b3-6 
(ш) ё1= —1, 42=3 
(iv) = = 5, #2= 1 


Answer: 
(С mom. сё 
FIT 804 45 
Gi) gam у„—28 
517—715: 72715 
Gus 2 19 2 7-157 
MPs Aa 6 
iv = 1 a3 
MET 5: 12—5 
12. x1 x34 5х3 =b] 
=x] = 2x3 =) 


2x1 + 5x3 + 4х3 = Рз 

(i) 21= 1, 3-20, à3— -1 
(i) 21=9, b3—1, b3—1 

ш) = —1, b3— – 1, 43=0 


In Exercises 13-17, determine conditions on the $b_i$'s, if any, in order to guarantee that the linear system is consistent.


13. %1 +3х2= 
2x, + хә = 53 


Answer: 


No conditions on b, and 54 
14, бх] = 4x5 =b] 


3x1 = 2x2 = Рз 

15. х= 2х2 5х3 = bı 
4x;—5x3--8x3 = bz 
=3x1 3х9 = 3x3 = b3 
Answer: 
b3=b1 -b7 

16. xi—-2x3— x3 = 41 
—4x; 5х2 + 2х3 = Рә 
—4x,+7xg+4x3 = 43 

17. х= X3--3x3--2x4 = J 
=—2x, + х2++5х3+ x4 = Рә 
=—3x, 2х2 4 2х3 = x4 = 43 
4x, = 3x2 х3 + 3x4 = Ёд 
Answer: 


by =b3 + ba, ba = 253 + 54 


18. Consider the matrices 


2 l 2 ХІ 
А=|2 2 —2| and х= | х2 
31 1 х3 


(a) Show that the equation $A\mathbf{x} = \mathbf{x}$ can be rewritten as $(A - I)\mathbf{x} = \mathbf{0}$ and use this result to solve $A\mathbf{x} = \mathbf{x}$ for $\mathbf{x}$.


(b) Solve $A\mathbf{x} = 4\mathbf{x}$.


In Exercises 19—20, solve the given matrix equation for X. 


19.]1 -1 1 2-1 578 
2 3 O|F=/4 0-30 1 
0 2 —1 3 5 —? 2 1 
Answer 


20.| -2 0 1 
1 


21. Let $A\mathbf{x} = \mathbf{0}$ be a homogeneous system of n linear equations in n unknowns that has only the trivial solution. Show that if k is any positive
integer, then the system $A^k\mathbf{x} = \mathbf{0}$ also has only the trivial solution.


22. Let $A\mathbf{x} = \mathbf{0}$ be a homogeneous system of n linear equations in n unknowns, and let Q be an invertible n × n matrix. Show that $A\mathbf{x} = \mathbf{0}$ has just
the trivial solution if and only if $(QA)\mathbf{x} = \mathbf{0}$ has just the trivial solution.


23. Let $A\mathbf{x} = \mathbf{b}$ be any consistent system of linear equations, and let $\mathbf{x}_1$ be a fixed solution. Show that every solution to the system can be written in
the form $\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_0$, where $\mathbf{x}_0$ is a solution to $A\mathbf{x} = \mathbf{0}$. Show also that every matrix of this form is a solution.


24. Use part (a) of Theorem 1.6.3 to prove part (b).


True-False Exercises 

In parts (a)-(g) determine whether the statement is true or false, and justify your answer. 

(a) It is impossible for a linear system of linear equations to have exactly two solutions. 
Answer: 


True 


(b) If the linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution, then the linear system $A\mathbf{x} = \mathbf{c}$ also must have a unique solution.
Answer: 


True 


(c) If A and B are n × n matrices such that $AB = I_n$, then $BA = I_n$.
Answer: 


True 


(d) If A and B are row equivalent matrices, then the linear systems $A\mathbf{x} = \mathbf{0}$ and $B\mathbf{x} = \mathbf{0}$ have the same solution set.
Answer: 


True 


(e) If A is an n × n matrix and S is an n × n invertible matrix, then if $\mathbf{x}$ is a solution to the linear system $(S^{-1}AS)\mathbf{x} = \mathbf{b}$, then $S\mathbf{x}$ is a solution to the
linear system $A\mathbf{y} = S\mathbf{b}$.
Answer: 


True 


(f) Let A be an n × n matrix. The linear system $A\mathbf{x} = 4\mathbf{x}$ has a unique solution if and only if $A - 4I$ is an invertible matrix.
Answer: 


True 


(g) Let A and B be n × n matrices. If A or B (or both) are not invertible, then neither is AB.
Answer: 


True 
1.7 Diagonal, Triangular, and Symmetric Matrices


In this section we will discuss matrices that have various special forms. These matrices arise in a wide variety of applications 
and will also play an important role in our subsequent work. 
Diagonal Matrices 


A square matrix in which all the entries off the main diagonal are zero is called a diagonal matrix. Here are some examples: 


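\[ \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 6 & 0 & 0 & 0 \\ 0 & -4 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 8 \end{bmatrix} \]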


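A general n × n diagonal matrix D can be written as

\[ D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} \tag{1} \]

A diagonal matrix is invertible if and only if all of its diagonal entries are nonzero; in this case the inverse of (1) is

\[ D^{-1} = \begin{bmatrix} 1/d_1 & 0 & \cdots & 0 \\ 0 & 1/d_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1/d_n \end{bmatrix} \tag{2} \]

Confirm Formula 2 by showing that

\[ DD^{-1} = D^{-1}D = I \]

Powers of diagonal matrices are easy to compute; we leave it for you to verify that if D is the diagonal matrix (1) and k is a
positive integer, then

\[ D^k = \begin{bmatrix} d_1^k & 0 & \cdots & 0 \\ 0 & d_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & d_n^k \end{bmatrix} \tag{3} \]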
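
Because the inverse and the powers of a diagonal matrix act entry by entry on the diagonal, they can be computed from the list of diagonal entries alone. The sketch below is one way to express this; the function name `diag_power` and the representation of D by its diagonal entries are our own assumptions, and negative exponents are handled by first inverting each entry.

```python
from fractions import Fraction

def diag_power(d, k):
    """Diagonal entries of D**k, where d lists the diagonal entries of D.

    A negative k is allowed only when D is invertible (no zero on the diagonal).
    """
    entries = [Fraction(x) for x in d]
    if k < 0:
        if any(x == 0 for x in entries):
            raise ValueError("D is not invertible: a diagonal entry is zero")
        entries = [1 / x for x in entries]   # diagonal of the inverse
        k = -k
    return [x ** k for x in entries]

# Hypothetical diagonal entries used for illustration.
d = [1, -3, 2]
inverse_diagonal = diag_power(d, -1)
fifth_power_diagonal = diag_power(d, 5)
```

Storing only the diagonal avoids building the full n × n array, which is the practical reason these matrices are cheap to work with.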


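EXAMPLE 1 Inverses and Powers of Diagonal Matrices


If

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]

then

\[ A^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{3} & 0 \\ 0 & 0 & \tfrac{1}{2} \end{bmatrix}, \quad A^{5} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -243 & 0 \\ 0 & 0 & 32 \end{bmatrix}, \quad A^{-5} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{243} & 0 \\ 0 & 0 & \tfrac{1}{32} \end{bmatrix} \]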


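Matrix products that involve diagonal factors are especially easy to compute. For example,

\[ \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} = \begin{bmatrix} d_1a_{11} & d_1a_{12} & d_1a_{13} & d_1a_{14} \\ d_2a_{21} & d_2a_{22} & d_2a_{23} & d_2a_{24} \\ d_3a_{31} & d_3a_{32} & d_3a_{33} & d_3a_{34} \end{bmatrix} \]

\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{bmatrix} \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix} = \begin{bmatrix} d_1a_{11} & d_2a_{12} & d_3a_{13} \\ d_1a_{21} & d_2a_{22} & d_3a_{23} \\ d_1a_{31} & d_2a_{32} & d_3a_{33} \\ d_1a_{41} & d_2a_{42} & d_3a_{43} \end{bmatrix} \]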

In words, to multiply a matrix A on the left by a diagonal matrix D, one can multiply successive rows of A by the
successive diagonal entries of D, and to multiply A on the right by D, one can multiply successive columns of A by the
successive diagonal entries of D.
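
The row-scaling and column-scaling rules can be compared against ordinary matrix multiplication. In the sketch below all names (`scale_rows`, `scale_cols`, `mat_mul`, `diag`) and the sample entries are hypothetical, chosen only to illustrate the rule stated above.

```python
def mat_mul(X, Y):
    """Ordinary matrix product of X (m x r) and Y (r x n), as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def diag(d):
    """Square diagonal matrix with the entries of d on the main diagonal."""
    return [[d[i] if i == j else 0 for j in range(len(d))] for i in range(len(d))]

def scale_rows(d, A):
    """DA: multiply row i of A by d[i]."""
    return [[di * a for a in row] for di, row in zip(d, A)]

def scale_cols(A, d):
    """AD: multiply column j of A by d[j]."""
    return [[a * dj for a, dj in zip(row, d)] for row in A]

d = [2, -1, 3]
A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4, so DA is defined
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 0, 1]]    # 4 x 3, so BD is defined

left_matches = scale_rows(d, A) == mat_mul(diag(d), A)
right_matches = scale_cols(B, d) == mat_mul(B, diag(d))
```

The scaling versions use one multiplication per entry, whereas the general product performs a full dot product for each entry.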


Triangular Matrices 


A square matrix in which all the entries above the main diagonal are zero is called lower triangular, and a square matrix in
which all the entries below the main diagonal are zero is called upper triangular. A matrix that is either upper triangular or
lower triangular is called triangular.


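EXAMPLE 2 Upper and Lower Triangular Matrices


A general 4 × 4 upper triangular matrix:

\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ 0 & a_{22} & a_{23} & a_{24} \\ 0 & 0 & a_{33} & a_{34} \\ 0 & 0 & 0 & a_{44} \end{bmatrix} \]

A general 4 × 4 lower triangular matrix:

\[ \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \]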

Remark Observe that diagonal matrices are both upper triangular and lower triangular since they have zeros below and 
above the main diagonal. Observe also that a square matrix in row echelon form is upper triangular since it has zeros below 
the main diagonal. 


Properties of Triangular Matrices 


Example 2 illustrates the following four facts about triangular matrices that we will state without formal proof. 


* A square matrix $A = [a_{ij}]$ is upper triangular if and only if all entries to the left of the main diagonal are zero; that is,
$a_{ij} = 0$ if $i > j$ (Figure 1.7.1).


* A square matrix $A = [a_{ij}]$ is lower triangular if and only if all entries to the right of the main diagonal are zero; that is,
$a_{ij} = 0$ if $i < j$ (Figure 1.7.1).


* A square matrix $A = [a_{ij}]$ is upper triangular if and only if the ith row starts with at least $i - 1$ zeros for every i.


* A square matrix $A = [a_{ij}]$ is lower triangular if and only if the jth column starts with at least $j - 1$ zeros for every j.


Figure 1.7.1 
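
The index conditions in these characterizations translate directly into tests on the entries. The following sketch is an illustration only; the function names and the sample matrices are our own, and indices are 0-based in the code, so "row i starts with at least i − 1 zeros" becomes "row i starts with at least i zeros".

```python
def is_upper_triangular(A):
    """True when a_ij = 0 for every i > j (entries below the main diagonal)."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))

def is_lower_triangular(A):
    """True when a_ij = 0 for every i < j (entries above the main diagonal)."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))

def leading_zeros(row):
    """Number of zeros before the first nonzero entry of a row."""
    count = 0
    for entry in row:
        if entry != 0:
            break
        count += 1
    return count

def rows_start_with_enough_zeros(A):
    """Third characterization, 0-based: row i starts with at least i zeros."""
    return all(leading_zeros(A[i]) >= i for i in range(len(A)))

U = [[1, 3, -1], [0, 2, 4], [0, 0, 5]]      # hypothetical upper triangular matrix
L = [[1, 0, 0], [3, 2, 0], [-1, 4, 5]]      # its transpose, lower triangular
M = [[1, 2], [3, 4]]                        # neither
```

For square inputs the two upper-triangular tests agree, which is the content of the first and third bullets.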


The following theorem lists some of the basic properties of triangular matrices. 


THEOREM 1.7.1 


(a) The transpose of a lower triangular matrix is upper triangular, and the transpose of an upper triangular matrix is 
lower triangular. 


(b) The product of lower triangular matrices is lower triangular, and the product of upper triangular matrices is upper 
triangular. 
(c) A triangular matrix is invertible if and only if its diagonal entries are all nonzero.


(d) The inverse of an invertible lower triangular matrix is lower triangular, and the inverse of an invertible upper 
triangular matrix is upper triangular. 


Part (a) is evident from the fact that transposing a square matrix can be accomplished by reflecting the entries about the main
diagonal; we omit the formal proof. We will prove (b), but we will defer the proofs of (c) and (d) to the next chapter, where 
we will have the tools to prove those results more efficiently. 


Proof (b) We will prove the result for lower triangular matrices; the proof for upper triangular matrices is similar. Let
$A = [a_{ij}]$ and $B = [b_{ij}]$ be lower triangular n × n matrices, and let $C = [c_{ij}]$ be the product $C = AB$. We can prove that C
is lower triangular by showing that $c_{ij} = 0$ for $i < j$. But from the definition of matrix multiplication,

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]

If we assume that $i < j$, then the terms in this expression can be grouped as follows:

\[ c_{ij} = \underbrace{a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{i(j-1)}b_{(j-1)j}}_{\text{first grouping}} + \underbrace{a_{ij}b_{jj} + \cdots + a_{in}b_{nj}}_{\text{second grouping}} \]

The first grouping consists of the terms in which the row number of b is less than the column number of b, and the
second grouping consists of the terms in which the row number of a is less than the column number of a.

In the first grouping all of the b factors are zero since B is lower triangular, and in the second grouping all of the a factors are
zero since A is lower triangular. Thus, $c_{ij} = 0$, which is what we wanted to prove.


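EXAMPLE 3 Computations with Triangular Matrices


Consider the upper triangular matrices

\[ A = \begin{bmatrix} 1 & 3 & -1 \\ 0 & 2 & 4 \\ 0 & 0 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & -2 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 1 \end{bmatrix} \]

It follows from part (c) of Theorem 1.7.1 that the matrix A is invertible but the matrix B is not. Moreover, the
theorem also tells us that $A^{-1}$, AB, and BA must be upper triangular. We leave it for you to confirm these three
statements by showing that

\[ A^{-1} = \begin{bmatrix} 1 & -\tfrac{3}{2} & \tfrac{7}{5} \\ 0 & \tfrac{1}{2} & -\tfrac{2}{5} \\ 0 & 0 & \tfrac{1}{5} \end{bmatrix}, \quad AB = \begin{bmatrix} 3 & -2 & -2 \\ 0 & 0 & 2 \\ 0 & 0 & 5 \end{bmatrix}, \quad BA = \begin{bmatrix} 3 & 5 & -1 \\ 0 & 0 & -5 \\ 0 & 0 & 5 \end{bmatrix} \]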


Symmetric Matrices 


DEFINITION 1 


A square matrix A is said to be symmetric if $A = A^T$.


It is easy to recognize a symmetric matrix by
inspection: The entries on the main diagonal have no
restrictions, but mirror images of entries across the
main diagonal must be equal. Here is a picture using
the second matrix in Example 4:


All diagonal matrices, such as the third matrix in
Example 4, obviously have this property.


EXAMPLE 4 Symmetric Matrices — 


The following matrices are symmetric, since each is equal to its own transpose (verify). 


di 0 0 0 
[2-3], Ja so 0 d; 0 0 
3 sf [504 0 0 dz 0 


Remark It follows from Formula 11 of Section 1.3 that a square matrix $A = [a_{ij}]$ is symmetric if and only if

\[ (A)_{ij} = (A)_{ji} \tag{4} \]

for all values of i and j.


The following theorem lists the main algebraic properties of symmetric matrices. The proofs are direct consequences of 
Theorem 1.4.8 and are omitted. 


THEOREM 1.7.2 


If A and B are symmetric matrices with the same size, and if k is any scalar, then: 
(a) $A^T$ is symmetric.

(b) A + B and A − B are symmetric.

(c) kA is symmetric.


It is not true, in general, that the product of symmetric matrices is symmetric. To see why this is so, let A and B be symmetric
matrices with the same size. Then it follows from part (e) of Theorem 1.4.8 and the symmetry of A and B that

\[ (AB)^T = B^T A^T = BA \]

Thus, $(AB)^T = AB$ if and only if $AB = BA$, that is, if and only if A and B commute. In summary, we have the following
result.


THEOREM 1.7.3 


The product of two symmetric matrices is symmetric if and only if the matrices commute. 
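
The theorem can be observed on small cases. In the sketch below the three symmetric 2 × 2 matrices are chosen here for illustration (they are our own assumption, not quoted from the text); one pair does not commute and its product fails to be symmetric, while the other pair commutes and its product is symmetric.

```python
def mat_mul(X, Y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def is_symmetric(X):
    return X == transpose(X)

# Illustrative symmetric matrices.
A = [[1, 2], [2, 3]]
B = [[-4, 1], [1, 0]]
C = [[-4, 3], [3, -1]]

AB, BA = mat_mul(A, B), mat_mul(B, A)
AC, CA = mat_mul(A, C), mat_mul(C, A)

# A and B do not commute, and AB is not symmetric.
first_pair = (AB == BA, is_symmetric(AB))
# A and C commute, and AC is symmetric.
second_pair = (AC == CA, is_symmetric(AC))
```

In both cases the two flags agree, as the "if and only if" in the theorem requires.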


EXAMPLE 5 Products of Symmetric Matrices


The first of the following equations shows a product of symmetric matrices that is not symmetric, and the 
second shows a product of symmetric matrices that is symmetric. We conclude that the factors in the first 
equation do not commute, but those in the second equation do. We leave it for you to verify that this is so. 


a[o] = [55 2 
BB EFHESEE 


Invertibility of Symmetric Matrices 


In general, a symmetric matrix need not be invertible. For example, a diagonal matrix with a zero on the main diagonal is 


symmetric but not invertible. However, the following theorem shows that if a symmetric matrix happens to be invertible, then 
its inverse must also be symmetric. 


THEOREM 1.7.4 


If A is an invertible symmetric matrix, then $A^{-1}$ is symmetric.


Proof Assume that A is symmetric and invertible. From Theorem 1.4.9 and the fact that $A = A^T$, we have

\[ (A^{-1})^T = (A^T)^{-1} = A^{-1} \]

which proves that $A^{-1}$ is symmetric.


Products $AA^T$ and $A^TA$


Matrix products of the form $AA^T$ and $A^TA$ arise in a variety of applications. If A is an m × n matrix, then $A^T$ is an n × m
matrix, so the products $AA^T$ and $A^TA$ are both square matrices; the matrix $AA^T$ has size m × m, and the matrix $A^TA$ has size
n × n. Such products are always symmetric since

\[ (AA^T)^T = (A^T)^T A^T = AA^T \quad \text{and} \quad (A^TA)^T = A^T (A^T)^T = A^TA \]


EXAMPLE 6 The Product of a Matrix and Its Transpose Is Symmetric


Let A be the 2 x 3 matrix

A = [ 1  -2   4 ]
    [ 3   0  -5 ]

Then

A^T A = [  1   3 ] [ 1  -2   4 ]   [  10  -2  -11 ]
        [ -2   0 ] [ 3   0  -5 ] = [  -2   4   -8 ]
        [  4  -5 ]                 [ -11  -8   41 ]

AA^T = [ 1  -2   4 ] [  1   3 ]   [  21  -17 ]
       [ 3   0  -5 ] [ -2   0 ] = [ -17   34 ]
                     [  4  -5 ]


Observe that A^T A and AA^T are symmetric as expected.
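The two products in this example can be reproduced with a few lines of code. This is a minimal sketch: it takes A to be the 2 x 3 matrix with rows (1, -2, 4) and (3, 0, -5), which is what the factors displayed in the example indicate, and the helper functions are hypothetical names introduced here.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Return the matrix product XY."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, -2, 4],
     [3, 0, -5]]

AtA = matmul(transpose(A), A)   # 3 x 3
AAt = matmul(A, transpose(A))   # 2 x 2

for label, M in (("A^T A", AtA), ("A A^T", AAt)):
    print(label, "=", M, "| symmetric:", M == transpose(M))
```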


Later in this text, we will obtain general conditions on A under which AA^T and A^T A are invertible. However, in the special
case where A is square, we have the following result.


THEOREM 1.7.5


If A is an invertible matrix, then AA^T and A^T A are also invertible.


Proof Since A is invertible, so is A^T by Theorem 1.4.9. Thus AA^T and A^T A are invertible, since they are the products of
invertible matrices.


Concept Review 

* Diagonal matrix 

* Lower triangular matrix 
* Upper triangular matrix 
* Triangular matrix 


* Symmetric matrix 


Skills 

* Determine whether a diagonal matrix is invertible with no computations. 

* Compute matrix products involving diagonal matrices by inspection. 

* Determine whether a matrix is triangular. 

* Understand how the transpose operation affects diagonal and triangular matrices. 
* Understand how inversion affects diagonal and triangular matrices. 


* Determine whether a matrix is a symmetric matrix. 


Exercise Set 1.7 


In Exercises 1—4, determine whether the given matrix is invertible. 


4 


Answer: 
1 
2 0 
1 
uis 
2.1400 
ооо 
005 
3.| -1 0 0 
02 0 
1 
007 


Answer: 


-1 00 
L 

0 5 0 

0 0 3 
41-10 0 0 
оз о 0 
00-3 0 
00 0-2 


5.13 00 21 
0 —1 0||-4 1 
0 02 2.5 
Answer 
6 3 
4 -1 
4 10 
6 1 2 5174 0 0 
ae 0 030 
002 
7/5 0 01-3 204 -4 
02 0 1-5 30 3 
0 0 =3| -6 222 2 
Answer: 
=15 10 0 20 =20 
2 =10 6 0 6 
18 -6 =6 -6 –6 
8. 0 0 4-1 31-3 0 


04|-5 1 -2 


In Exercises 9-12, find A^2, A^{-2}, and A^{-k} (where k is any integer) by inspection.


ә. [1 0 
А= 


Answer: 


п. 


1 
5 0 0 
e T 
A=|0 = 0 
1 
0 04 
Answer: 
1 
t o o 
4 40 0 2 0 0 
4%=| 2 o| А? =|0о 0 of, A*=| o 3* о 
at 0 0 16 0 0 4* 
16 
12. Е Е, TD 
0-4 00 
m 0 0-30 
0 0 02 


In Exercises 13—19, decide whether the given matrix is symmetric. 


13.| -8 —8 
0 00 
Answer: 


Not symmetric 
e E 

b- 2 
| 0 p 

—j 7 


Answer: 


Answer: 


Not symmetric 


18.| 2 —1 3 
=1 51 
du. dU 

19.|0 0 1 
020 
300 


Answer: 


Not symmetric 


In Exercises 20—22, decide by inspection whether the given matrix is invertible. 


20.]|-1 2 4 
030 
0 0 5 
21.|0 1 —2 5 
0 1 5 6 
0 0 =3 1 
0 O0- 5 
Answer: 
Not invertible 
22. 2 00 0 
-3 —-10 0 
—4 —6 0 0 
0 38 —5 


In Exercises 23—24, find all values of the unknown constant(s) in order for А to be symmetric. 


ва 1] 


а+5 —1 
Answer: 
a= – 8 
24. 2 2-25 + 2с 2a+b+e 
4А= |3 5 ate 
0 —2 7 


In Exercises 25—26, find all values of x in order for A to be invertible. 


25. A = [ x - 1   x^2     x^4   ]
        [ 0       x + 2   x^3   ]
        [ 0       0       x - 4 ]
Answer:
x ≠ 1, -2, 4
2 x-i 0 0 
- _ 1 
А= x x 3 0 
x? x? x-i 


27 1 о 0 
A^-|0 -1 
йб 


Answer: 


1 0 0 
0-1 0 
0 0 —1 
28. 900 
A3-|040 
00 1 
29. Verify Theorem 1.7.1(b) for the product AB, where

A = [ -1  2   5 ]       B = [ 2  -8  0 ]
    [  0  1   3 ]           [ 0   2  1 ]
    [  0  0  -4 ]           [ 0   0  3 ]


30. Verify Theorem 1.7.1(d) for the matrices А and В in Exercise 29. 
31. Verify Theorem 1.7.4 for the given matrix A. 


© 4-| 2 E! 


xp 3 
(b) ] 3 
A=|-2 1 -7 
es cd 


32. Let A be an n x n symmetric matrix.
(a) Show that A^2 is symmetric.
(b) Show that 2A^2 - 3A + I is symmetric.
33. Prove: If A^T A = A, then A is symmetric and A = A^2.
34. Find all 3 x 3 diagonal matrices A that satisfy A^2 - 3A - 4I = 0.
35. Let A = [a_ij] be an n x n matrix. Determine whether A is symmetric.
(a) a_ij = i^2 + j^2
(b) a_ij = i^2 - j^2
(c) a_ij = 2i + 2j
(d) ayy = 2i? - 2? 


Answer: 


(a) Yes
(b) No (unless n = 1)
(c) Yes
(d) No (unless n = 1)
36. On the basis of your experience with Exercise 35, devise a general test that can be applied to a formula for a_ij to determine
whether A = [a_ij] is symmetric.


37. A square matrix A is called skew-symmetric if A^T = -A.


Prove:


(a) If A is an invertible skew-symmetric matrix, then A^{-1} is skew-symmetric.


(b) If A and B are skew-symmetric matrices, then so are A^T, A + B, A - B, and kA for any scalar k.


(c) Every square matrix A can be expressed as the sum of a symmetric matrix and a skew-symmetric matrix. [Hint: Note
the identity A = (1/2)(A + A^T) + (1/2)(A - A^T).]


In Exercises 38—39, fill in the missing entries (marked with x) to produce a skew-symmetric matrix. 


38. A = [ x   x  4 ]
        [ 0   x  x ]
        [ x  -1  x ]

39. A = [ x  0   x ]
        [ x  x  -4 ]
        [ 8  x   x ]
Answer:

A = [ 0  0  -8 ]
    [ 0  0  -4 ]
    [ 8  4   0 ]

40. Find all values of a, b, c, and d for which A is skew-symmetric.

A = [  0   2a - 3b + c   3a - 5b + 5c ]
    [ -2   0             5a - 8b + 6c ]
    [ -3  -5             d            ]

41. We showed in the text that the product of symmetric matrices is symmetric if and only if the matrices commute. Is the
product of commuting skew-symmetric matrices skew-symmetric? Explain. [Note: See Exercise 37 for the definition of
skew-symmetric.]


42. If the n x n matrix A can be expressed as A = LU, where L is a lower triangular matrix and U is an upper triangular
matrix, then the linear system Ax = b can be expressed as LUx = b and can be solved in two steps:


Step 1. Let Ux = y, so that LUx = b can be expressed as Ly = b. Solve this system.
Step 2. Solve the system Ux = y for x.


In each part, use this two-step method to solve the given system. 


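(a) [  1  0  0 ] [ 2  -1  3 ] [ x_1 ]   [  1 ]
    [ -2  3  0 ] [ 0   1  2 ] [ x_2 ] = [ -2 ]
    [  2  4  1 ] [ 0   0  4 ] [ x_3 ]   [  0 ]

(b) [  2   0  0 ] [ 3  -5  2 ] [ x_1 ]   [  4 ]
    [  4   1  0 ] [ 0   4  1 ] [ x_2 ] = [ -5 ]
    [ -3  -2  3 ] [ 0   0  2 ] [ x_3 ]   [  2 ]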
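The two-step method of Exercise 42 amounts to a forward substitution followed by a back substitution. The following is a minimal sketch under stated assumptions: the 2 x 2 matrices L and U and the vector b are hypothetical data chosen here (not taken from parts (a) or (b)), the function names are illustrative, and the diagonal entries of L and U are assumed to be nonzero.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L with nonzero diagonal entries."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))
        y.append(Fraction(b[i] - known) / row[i])
    return y

def back_substitution(U, y):
    """Solve Ux = y for upper triangular U with nonzero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - known) / U[i][i]
    return x

# Hypothetical data: A = LU with L lower and U upper triangular.
L = [[2, 0], [1, 3]]
U = [[1, 4], [0, 5]]
b = [2, 16]

y = forward_substitution(L, b)   # Step 1: solve Ly = b
x = back_substitution(U, y)      # Step 2: solve Ux = y
print("y =", y, " x =", x)
```

Exact fractions are used so that the result can be compared with a hand computation without rounding concerns.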
43. Find an upper triangular matrix that satisfies

A^3 = [ 1  30 ]
      [ 0  -8 ]
Answer:

A = [ 1  10 ]
    [ 0  -2 ]


True-False Exercises 
In parts (a)-(m) determine whether the statement is true or false, and justify your answer. 


(a) The transpose of a diagonal matrix is a diagonal matrix. 


Answer: 


True 


(b) The transpose of an upper triangular matrix is an upper triangular matrix. 
Answer: 


False 


(c) The sum of an upper triangular matrix and a lower triangular matrix is a diagonal matrix. 
Answer: 


False 


(d) All entries of a symmetric matrix are determined by the entries occurring on and above the main diagonal.
Answer: 


True 


(e) All entries of an upper triangular matrix are determined by the entries occurring on and above the main diagonal.
Answer: 


True 


(f) The inverse of an invertible lower triangular matrix is an upper triangular matrix. 
Answer: 


False 


(g) A diagonal matrix is invertible if and only if all of its diagonal entries are positive. 
Answer: 


False 


(h) The sum of a diagonal matrix and a lower triangular matrix is a lower triangular matrix. 
Answer: 


True 


(i) A matrix that is both symmetric and upper triangular must be a diagonal matrix. 
Answer: 


True 


(j) If A and B are n x n matrices such that A + B is symmetric, then A and B are symmetric.
Answer: 


False 


(k) If A and B are n x n matrices such that A + B is upper triangular, then A and B are upper triangular.
Answer: 


False 


(l) If A^2 is a symmetric matrix, then A is a symmetric matrix.


Answer: 


False 


(m) If kA is a symmetric matrix for some k ≠ 0, then A is a symmetric matrix.
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


1.8 Applications of Linear Systems 


In this section we will discuss some relatively brief applications of linear systems. These are but a small sample of the wide 
variety of real-world problems to which our study of linear systems is applicable. 


Network Analysis 


The concept of a network appears in a variety of applications. Loosely stated, a network is a set of branches through which 
something "flows." For example, the branches might be electrical wires through which electricity flows, pipes through 
which water or oil flows, traffic lanes through which vehicular traffic flows, or economic linkages through which money 
flows, to name a few possibilities. 


In most networks, the branches meet at points, called nodes or junctions, where the flow divides. For example, in an 
electrical network, nodes occur where three or more wires join, in a traffic network they occur at street intersections, and in 
a financial network they occur at banking centers where incoming money is distributed to individuals or other institutions. 


In the study of networks, there is generally some numerical measure of the rate at which the medium flows through a 
branch. For example, the flow rate of electricity is often measured in amperes, the flow rate of water or oil in gallons per 
minute, the flow rate of traffic in vehicles per hour, and the flow rate of European currency in millions of Euros per day. 
We will restrict our attention to networks in which there is flow conservation at each node, by which we mean that the rate 
of flow into any node is equal to the rate of flow out of that node. This ensures that the flow medium does not build up at 
the nodes and block the free movement of the medium through the network. 


A common problem in network analysis is to use known flow rates in certain branches to find the flow rates in all of the 
branches. Here is an example. 


EXAMPLE 1 Network Analysis Using Linear Systems


Figure 1.8.1 shows a network with four nodes in which the flow rate and direction of flow in certain 
branches are known. Find the flow rates and directions of flow in the remaining branches. 


Figure 1.8.1


Solution As illustrated in Figure 1.8.2, we have assigned arbitrary directions to the unknown flow rates
x_1, x_2, and x_3. We need not be concerned if some of the directions are incorrect, since an incorrect direction
will be signaled by a negative value for the flow rate when we solve for the unknowns.


Figure 1.8.2


It follows from the conservation of flow at node A that

x_1 + x_2 = 30

Similarly, at the other nodes we have

x_2 + x_3 = 35 (node B)
x_3 + 15 = 60 (node C)
x_1 + 15 = 55 (node D)


These four conditions produce the linear system


x_1 + x_2 = 30
x_2 + x_3 = 35
x_3 = 45
x_1 = 40


which we can now try to solve for the unknown flow rates. In this particular case the system is sufficiently 
simple that it can be solved by inspection (work from the bottom up). We leave it for you to confirm that the 
solution is 


x_1 = 40, x_2 = -10, x_3 = 45


The fact that x_2 is negative tells us that the direction assigned to that flow in Figure 1.8.2 is incorrect; that is,
the flow in that branch is into node A.
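The "bottom up" solution described above can be traced in a few lines. This sketch simply mirrors the four node equations of the example; the variable names are illustrative.

```python
# Work from the bottom of the system upward, one node equation at a time.
x1 = 55 - 15          # node D: x1 + 15 = 55
x3 = 60 - 15          # node C: x3 + 15 = 60
x2 = 30 - x1          # node A: x1 + x2 = 30

# Node B was not used to solve, so it serves as a consistency check.
node_b_ok = (x2 + x3 == 35)

for name, value in (("x1", x1), ("x2", x2), ("x3", x3)):
    note = "direction as assigned" if value >= 0 else "assigned direction is reversed"
    print(f"{name} = {value:4d}  ({note})")
print("node B equation satisfied:", node_b_ok)
```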


EXAMPLE 2 Design of Traffic Patterns


The network in Figure 1.8.3 shows a proposed plan for the traffic flow around a new park that will house the 
Liberty Bell in Philadelphia, Pennsylvania. The plan calls for a computerized traffic light at the north exit on 
Fifth Street, and the diagram indicates the average number of vehicles per hour that are expected to flow in 
and out of the streets that border the complex. All streets are one-way. 


(a) How many vehicles per hour should the traffic light let through to ensure that the average number of 
vehicles per hour flowing into the complex is the same as the average number of vehicles flowing out? 


(b) Assuming that the traffic light has been set to balance the total flow in and out of the complex, what can 
you say about the average number of vehicles per hour that will flow along the streets that border the 
complex?

Figure 1.8.3

Solution


(a) If, as indicated in Figure 1.8.3b, we let x denote the number of vehicles per hour that the traffic light must
let through, then the total number of vehicles per hour that flow in and out of the complex will be


Flowing in: 500 + 400 + 600 + 200 = 1700
Flowing out: x + 700 + 400


Equating the flows in and out shows that the traffic light should let x = 600 vehicles per hour pass 
through. 


(b) To avoid traffic congestion, the flow in must equal the flow out at each intersection. For this to happen,
the following conditions must be satisfied:


Intersection   Flow In          Flow Out
A              400 + 600    =   x_1 + x_2
B              x_2 + x_3    =   400 + x
C              500 + 200    =   x_3 + x_4
D              x_1 + x_4    =   700


Thus, with x = 600, as computed in part (a), we obtain the following linear system:


x_1 + x_2 = 1000
x_2 + x_3 = 1000
x_3 + x_4 = 700
x_1 + x_4 = 700


We leave it for you to show that the system has infinitely many solutions and that these are given by the 
parametric equations 


x_1 = 700 - t, x_2 = 300 + t, x_3 = 700 - t, x_4 = t (1)


However, the parameter t is not completely arbitrary here, since there are physical constraints to be
considered. For example, the average flow rates must be nonnegative since we have assumed the streets
to be one-way, and a negative flow rate would indicate a flow in the wrong direction. This being the
case, we see from (1) that t can be any real number that satisfies 0 ≤ t ≤ 700, which implies that the
average flow rates along the streets will fall in the ranges


0 ≤ x_1 ≤ 700, 300 ≤ x_2 ≤ 1000, 0 ≤ x_3 ≤ 700, 0 ≤ x_4 ≤ 700
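The parametric solution (1) and the constraint on t can be spot-checked as follows. This is a sketch only: the function names are hypothetical, and the sample values of t are arbitrary choices.

```python
def flows(t):
    """Flow rates (x1, x2, x3, x4) given by the parametric equations (1)."""
    return (700 - t, 300 + t, 700 - t, t)

def satisfies_system(x1, x2, x3, x4):
    """Check the four intersection equations with x = 600."""
    return (x1 + x2 == 1000 and x2 + x3 == 1000
            and x3 + x4 == 700 and x1 + x4 == 700)

def physically_valid(t):
    """One-way streets require every flow rate to be nonnegative."""
    return all(x >= 0 for x in flows(t))

for t in (-50, 0, 250, 700, 750):
    print(f"t = {t:4d}: flows = {flows(t)}, "
          f"solves system: {satisfies_system(*flows(t))}, "
          f"nonnegative: {physically_valid(t)}")
```

Every value of t solves the linear system, but only values in the interval from 0 to 700 give nonnegative flows.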


Electrical Circuits 


Next, we will show how network analysis can be used to analyze electrical circuits consisting of batteries and resistors. A
battery is a source of electric energy, and a resistor, such as a lightbulb, is an element that dissipates electric energy. Figure
1.8.4 shows a schematic diagram of a circuit with one battery, one resistor, and a switch, each represented by its standard
circuit symbol. The battery has a positive pole (+) and a negative pole (-). When the switch is closed,
electrical current is considered to flow from the positive pole of the battery, through the resistor, and back to the negative
pole (indicated by the arrowhead in the figure).


Figure 1.8.4


Electrical current, which is a flow of electrons through wires, behaves much like the flow of water through pipes. A battery 
acts like a pump that creates "electrical pressure" to increase the flow rate of electrons, and a resistor acts like a restriction 
in a pipe that reduces the flow rate of electrons. The technical term for electrical pressure is electrical potential; it is 
commonly measured in volts (V). The degree to which a resistor reduces the electrical potential is called its resistance and 
is commonly measured in ohms (Ω). The rate of flow of electrons in a wire is called current and is commonly measured in
amperes (also called amps) (A). The precise effect of a resistor is given by the following law: 


Ohm's Law 


If a current of I amperes passes through a resistor with a resistance of R ohms, then there is a resulting drop of E
volts in electrical potential that is the product of the current and resistance; that is,


E = IR


A typical electrical network will have multiple batteries and resistors joined by some configuration of wires. A point at
which three or more wires in a network are joined is called a node (or junction point). A branch is a wire connecting two 
nodes, and a closed loop is a succession of connected branches that begin and end at the same node. For example, the 
electrical network in Figure 1.8.5 has two nodes and three closed loops— two inner loops and one outer loop. As current 
flows through an electrical network, it undergoes increases and decreases in electrical potential, called voltage rises and 
voltage drops, respectively. The behavior of the current at the nodes and around closed loops is governed by two 
fundamental laws: 


Figure 1.8.5


Kirchhoff's Current Law 


The sum of the currents flowing into any node is equal to the sum of the currents flowing out. 


Kirchhoff's Voltage Law 


In one traversal of any closed loop, the sum of the voltage rises equals the sum of the voltage drops. 


Kirchhoff's current law is a restatement of the principle of flow conservation at a node that was stated for general networks.
Thus, for example, the currents at the top node in Figure 1.8.6 satisfy the equation I_1 = I_2 + I_3.


Figure 1.8.6 


In circuits with multiple loops and batteries there is usually no way to tell in advance which way the currents are flowing, 
so the usual procedure in circuit analysis is to assign arbitrary directions to the current flows in the branches and let the 
mathematical computations determine whether the assignments are correct. In addition to assigning directions to the 
current flows, Kirchhoff's voltage law requires a direction of travel for each closed loop. The choice is arbitrary, but for
consistency we will always take this direction to be clockwise (Figure 1.8.7). We also make the following conventions: 


* A voltage drop occurs at a resistor if the direction assigned to the current through the resistor is the same as the direction 
assigned to the loop, and a voltage rise occurs at a resistor if the direction assigned to the current through the resistor is 
the opposite to that assigned to the loop. 


* A voltage rise occurs at a battery if the direction assigned to the loop is from — to + through the battery, and a voltage 
drop occurs at a battery if the direction assigned to the loop is from + to — through the battery. 


If you follow these conventions when calculating currents, then those currents whose directions were assigned correctly 
will have positive values and those whose directions were assigned incorrectly will have negative values. 


Figure 1.8.7 Clockwise closed-loop convention with arbitrary direction assignments to currents in the branches


EXAMPLE 3 A Circuit with One Closed Loop


Determine the current I in the circuit shown in Figure 1.8.8.


Figure 1.8.8


Solution Since the direction assigned to the current through the resistor is the same as the direction of the
loop, there is a voltage drop at the resistor. By Ohm's law this voltage drop is E = IR = 3I. Also, since the
direction assigned to the loop is from - to + through the battery, there is a voltage rise of 6 volts at the
battery. Thus, it follows from Kirchhoff's voltage law that

3I = 6

from which we conclude that the current is I = 2 A. Since I is positive, the direction assigned to the current
flow is correct.


EXAMPLE 4 A Circuit with Three Closed Loops


Determine the currents I_1, I_2, and I_3 in the circuit shown in Figure 1.8.9.


Figure 1.8.9


Solution Using the assigned directions for the currents, Kirchhoff's current law provides one equation for
each node:

Node   Current In       Current Out
A      I_1 + I_2    =   I_3
B      I_3          =   I_1 + I_2


However, these equations are really the same, since both can be expressed as


I_1 + I_2 - I_3 = 0 (2)


Gustav Kirchhoff (1824-1887) 


Historical Note The German physicist Gustav Kirchhoff was a student of Gauss. His work on 
Kirchhoff's laws, announced in 1854, was a major advance in the calculation of currents, voltages, 
and resistances of electrical circuits. Kirchhoff was severely disabled and spent most of his life on 
crutches or in a wheelchair. 

[Image: © SSPL/The Image Works]


To find unique values for the currents we will need two more equations, which we will obtain from
Kirchhoff's voltage law. We can see from the network diagram that there are three closed loops, a left inner
loop containing the 50 V battery, a right inner loop containing the 30 V battery, and an outer loop that
contains both batteries. Thus, Kirchhoff's voltage law will actually produce three equations. With a
clockwise traversal of the loops, the voltage rises and drops in these loops are as follows:


                     Voltage Rises            Voltage Drops
Left Inside Loop     50                       5I_1 + 20I_3
Right Inside Loop    30 + 10I_2 + 20I_3       0
Outside Loop         30 + 50 + 10I_2          5I_1

These conditions can be rewritten as

5I_1 + 20I_3 = 50
10I_2 + 20I_3 = -30 (3)
5I_1 - 10I_2 = 80


However, the last equation is superfluous, since it is the difference of the first two. Thus, if we combine (2)
and the first two equations in (3), we obtain the following linear system of three equations in the three
unknown currents:


I_1 + I_2 - I_3 = 0
5I_1 + 20I_3 = 50
10I_2 + 20I_3 = -30


We leave it for you to solve this system and show that I_1 = 6 A, I_2 = -5 A, and I_3 = 1 A. The fact that I_2
is negative tells us that the direction of this current is opposite to that indicated in Figure 1.8.9.
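One way to confirm these currents is to solve the 3 x 3 system by machine. The sketch below uses Cramer's rule, which is a substitute technique chosen here for brevity and not the method the text has introduced at this point; the function names are hypothetical.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(coeffs, rhs):
    """Solve a 3 x 3 linear system with a nonsingular coefficient matrix."""
    d = det3(coeffs)
    if d == 0:
        raise ValueError("coefficient matrix is singular")
    solution = []
    for k in range(3):
        # Replace column k of the coefficient matrix by the right-hand side.
        replaced = [row[:k] + [rhs[r]] + row[k + 1:] for r, row in enumerate(coeffs)]
        solution.append(Fraction(det3(replaced), d))
    return solution

coeffs = [[1, 1, -1],     # I1 + I2 - I3 = 0     (current law)
          [5, 0, 20],     # 5 I1 + 20 I3 = 50    (left inside loop)
          [0, 10, 20]]    # 10 I2 + 20 I3 = -30  (right inside loop)
rhs = [0, 50, -30]

I1, I2, I3 = cramer3(coeffs, rhs)
print("I1 =", I1, "A, I2 =", I2, "A, I3 =", I3, "A")
print("outside loop equation holds:", 5 * I1 - 10 * I2 == 80)
```

The last line checks the superfluous outside-loop equation, which the solution satisfies automatically.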


Balancing Chemical Equations 


Chemical compounds are represented by chemical formulas that describe the atomic makeup of their molecules. For
example, water is composed of two hydrogen atoms and one oxygen atom, so its chemical formula is H2O; and stable
oxygen is composed of two oxygen atoms, so its chemical formula is O2.


When chemical compounds are combined under the right conditions, the atoms in their molecules rearrange to form new
compounds. For example, when methane burns, the methane (CH4) and stable oxygen (O2) react to form carbon dioxide
(CO2) and water (H2O). This is indicated by the chemical equation


CH4 + O2 → CO2 + H2O (4)


The molecules to the left of the arrow are called the reactants and those to the right the products. In this equation the plus 
signs serve to separate the molecules and are not intended as algebraic operations. However, this equation does not tell the 
whole story, since it fails to account for the proportions of molecules required for a complete reaction (no reactants left 
over). For example, we can see from the right side of (4) that to produce one molecule of carbon dioxide and one molecule
of water, one needs three oxygen atoms for each carbon atom. However, from the left side of (4) we see that one molecule of
methane and one molecule of stable oxygen have only two oxygen atoms for each carbon atom. Thus, on the reactant side 
the ratio of methane to stable oxygen cannot be one-to-one in a complete reaction. 


A chemical equation is said to be balanced if for each type of atom in the reaction, the same number of atoms appears on 
each side of the arrow. For example, the balanced version of Equation 4 is


CH4 + 2O2 → CO2 + 2H2O (5)


by which we mean that one methane molecule combines with two stable oxygen molecules to produce one carbon dioxide
molecule and two water molecules. In theory, one could multiply this equation through by any positive integer. For
example, multiplying through by 2 yields the balanced chemical equation

2CH4 + 4O2 → 2CO2 + 4H2O


However, the standard convention is to use the smallest positive integers that will balance the equation. 


Equation 4 is sufficiently simple that it could have been balanced by trial and error, but for more complicated chemical
equations we will need a systematic method. There are various methods that can be used, but we will give one that uses
systems of linear equations. To illustrate the method let us reexamine Equation 4. To balance this equation we must find
positive integers, x_1, x_2, x_3, and x_4 such that


x_1(CH4) + x_2(O2) → x_3(CO2) + x_4(H2O) (6)


For each of the atoms in the equation, the number of atoms on the left must be equal to the number of atoms on the right. 
Expressing this in tabular form we have 


            Left Side      Right Side
Carbon      x_1        =   x_3
Hydrogen    4x_1       =   2x_4
Oxygen      2x_2       =   2x_3 + x_4

from which we obtain the homogeneous linear system

x_1 - x_3 = 0
4x_1 - 2x_4 = 0
2x_2 - 2x_3 - x_4 = 0


The augmented matrix for this system is

[ 1  0  -1   0  0 ]
[ 4  0   0  -2  0 ]
[ 0  2  -2  -1  0 ]

We leave it for you to show that the reduced row echelon form of this matrix is

[ 1  0  0  -1/2  0 ]
[ 0  1  0  -1    0 ]
[ 0  0  1  -1/2  0 ]


from which we conclude that the general solution of the system is

x_1 = t/2, x_2 = t, x_3 = t/2, x_4 = t


where t is arbitrary. The smallest positive integer values for the unknowns occur when we let t = 2, so the equation can be
balanced by letting x_1 = 1, x_2 = 2, x_3 = 1, x_4 = 2. This agrees with our earlier conclusions, since substituting these
values into Equation 6 yields Equation 5.
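The step from the general solution to the smallest positive integer solution can be sketched in code. The example below starts from the general solution stated above, picks the smallest t that clears the denominators, and then re-counts the atoms on each side; the dictionary layout and helper name are illustrative assumptions.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9 and later

# General solution (x1, x2, x3, x4) = (t/2, t, t/2, t), written as multiples of t.
general = [Fraction(1, 2), Fraction(1), Fraction(1, 2), Fraction(1)]

# The smallest positive t that makes every unknown an integer.
t = lcm(*(c.denominator for c in general))
x = [int(c * t) for c in general]

# Atoms per molecule, in the order CH4, O2, CO2, H2O.
atoms = {"C": [1, 0, 1, 0], "H": [4, 0, 0, 2], "O": [0, 2, 2, 1]}
reactants, products = (0, 1), (2, 3)

def count(element, side):
    """Total number of atoms of one element on one side of the equation."""
    return sum(atoms[element][i] * x[i] for i in side)

balanced = all(count(e, reactants) == count(e, products) for e in atoms)
print("t =", t, "coefficients =", x, "balanced:", balanced)
```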


EXAMPLE 5 Balancing Chemical Equations Using Linear Systems


Balance the chemical equation

HCl + Na3PO4 → H3PO4 + NaCl
[hydrochloric acid] + [sodium phosphate] → [phosphoric acid] + [sodium chloride]


Solution Let x_1, x_2, x_3, and x_4 be positive integers that balance the equation

x_1(HCl) + x_2(Na3PO4) → x_3(H3PO4) + x_4(NaCl) (7)


Equating the number of atoms of each type on the two sides yields

1x_1 = 3x_3   Hydrogen (H)
1x_1 = 1x_4   Chlorine (Cl)
3x_2 = 1x_4   Sodium (Na)
1x_2 = 1x_3   Phosphorus (P)
4x_2 = 4x_3   Oxygen (O)


from which we obtain the homogeneous linear system


x_1 - 3x_3 = 0
x_1 - x_4 = 0
3x_2 - x_4 = 0
x_2 - x_3 = 0
4x_2 - 4x_3 = 0

We leave it for you to show that the reduced row echelon form of the augmented matrix for this system is

[ 1  0  0  -1    0 ]
[ 0  1  0  -1/3  0 ]
[ 0  0  1  -1/3  0 ]
[ 0  0  0   0    0 ]
[ 0  0  0   0    0 ]


from which we conclude that the general solution of the system is


x_1 = t, x_2 = t/3, x_3 = t/3, x_4 = t

where t is arbitrary. To obtain the smallest positive integers that balance the equation, we let t = 3, in which
case we obtain x_1 = 3, x_2 = 1, x_3 = 1, and x_4 = 3. Substituting these values in (7) produces the balanced
equation


3HCl + Na3PO4 → H3PO4 + 3NaCl


Polynomial Interpolation 


An important problem in various applications is to find a polynomial whose graph passes through a specified set of points 
in the plane; this is called an interpolating polynomial for the points. The simplest example of such a problem is to find a 
linear polynomial


p(x) = ax + b (8)


whose graph passes through two known distinct points, (x_1, y_1) and (x_2, y_2), in the xy-plane (Figure 1.8.10). You have
probably encountered various methods in analytic geometry for finding the equation of a line through two points, but here
we will give a method based on linear systems that can be adapted to general polynomial interpolation.


Figure 1.8.10


The graph of (8) is the line y = ax + b, and for this line to pass through the points (x_1, y_1) and (x_2, y_2), we must have

y_1 = ax_1 + b and y_2 = ax_2 + b

Therefore, the unknown coefficients a and b can be obtained by solving the linear system

ax_1 + b = y_1
ax_2 + b = y_2

We don't need any fancy methods to solve this system: the value of a can be obtained by subtracting the equations to
eliminate b, and then the value of a can be substituted into either equation to find b. We leave it as an exercise for you to
find a and b and then show that they can be expressed in the form


a = (y_2 - y_1)/(x_2 - x_1) and b = (y_1 x_2 - y_2 x_1)/(x_2 - x_1) (9)

provided x_1 ≠ x_2. Thus, for example, the line y = ax + b that passes through the points

(2, 1) and (5, 4)

can be obtained by taking (x_1, y_1) = (2, 1) and (x_2, y_2) = (5, 4), in which case (9) yields

a = (4 - 1)/(5 - 2) = 1 and b = ((1)(5) - (4)(2))/(5 - 2) = -1

Therefore, the equation of the line is

y = x - 1


(Figure 1.8.11).


Figure 1.8.11
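Formula (9) translates directly into a small function. This is a minimal sketch; the name `line_through` is hypothetical, and exact fractions are used so that the output matches a hand computation.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (a, b) such that y = ax + b passes through p1 and p2, using formula (9)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points must have distinct x-coordinates")
    d = Fraction(x2 - x1)
    a = (y2 - y1) / d
    b = (y1 * x2 - y2 * x1) / d
    return a, b

a, b = line_through((2, 1), (5, 4))
print(f"y = {a}x + ({b})")
```

For the points (2, 1) and (5, 4) this gives a = 1 and b = -1, in agreement with the line y = x - 1 found above.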


Now let us consider the more general problem of finding a polynomial whose graph passes through n points with distinct
x-coordinates


(x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_n, y_n) (10)

Since there are n conditions to be satisfied, intuition suggests that we should begin by looking for a polynomial of the form

p(x) = a_0 + a_1 x + a_2 x^2 + ... + a_{n-1} x^{n-1} (11)

since a polynomial of this form has n coefficients that are at our disposal to satisfy the n conditions. However, we want to
allow for cases where the points may lie on a line or have some other configuration that would make it possible to use a
polynomial whose degree is less than n - 1; thus, we allow for the possibility that a_{n-1} and other coefficients in (11) may
be zero.


The following theorem, which we will prove later in the text, is the basic result on polynomial interpolation. 


THEOREM 1.8.1 Polynomial Interpolation 


Given any n points in the xy-plane that have distinct x-coordinates, there is a unique polynomial of degree n — 1 
or less whose graph passes through those points. 


Let us now consider how we might go about finding the interpolating polynomial (11) whose graph passes through the points
in (10). Since the graph of this polynomial is the graph of the equation


y = a_0 + a_1 x + a_2 x^2 + ... + a_{n-1} x^{n-1} (12)


it follows that the coordinates of the points must satisfy


a_0 + a_1 x_1 + a_2 x_1^2 + ... + a_{n-1} x_1^{n-1} = y_1
a_0 + a_1 x_2 + a_2 x_2^2 + ... + a_{n-1} x_2^{n-1} = y_2
...
a_0 + a_1 x_n + a_2 x_n^2 + ... + a_{n-1} x_n^{n-1} = y_n        (13)


In these equations the values of the x's and y's are assumed to be known, so we can view this as a linear system in the
unknowns a_0, a_1, ..., a_{n-1}. From this point of view the augmented matrix for the system is


[ 1  x_1  x_1^2  ...  x_1^{n-1}  y_1 ]
[ 1  x_2  x_2^2  ...  x_2^{n-1}  y_2 ]
[ ...                                ]
[ 1  x_n  x_n^2  ...  x_n^{n-1}  y_n ]        (14)


and hence the interpolating polynomial can be found by reducing this matrix to reduced row echelon form (Gauss-Jordan
elimination).


EXAMPLE 6 Polynomial Interpolation by Gauss-Jordan Elimination


Find a cubic polynomial whose graph passes through the points

(1, 3), (2, -2), (3, -5), (4, 0)


Solution Since there are four points, we will use an interpolating polynomial of degree n = 3. Denote this
polynomial by

p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3

and denote the x- and y-coordinates of the given points by

x_1 = 1, x_2 = 2, x_3 = 3, x_4 = 4 and y_1 = 3, y_2 = -2, y_3 = -5, y_4 = 0


Thus, it follows from (14) that the augmented matrix for the linear system in the unknowns a_0, a_1, a_2, and a_3
is


[ 1  x_1  x_1^2  x_1^3  y_1 ]   [ 1  1   1   1   3 ]
[ 1  x_2  x_2^2  x_2^3  y_2 ] = [ 1  2   4   8  -2 ]
[ 1  x_3  x_3^2  x_3^3  y_3 ]   [ 1  3   9  27  -5 ]
[ 1  x_4  x_4^2  x_4^3  y_4 ]   [ 1  4  16  64   0 ]


We leave it for you to confirm that the reduced row echelon form of this matrix is


[ 1  0  0  0   4 ]
[ 0  1  0  0   3 ]
[ 0  0  1  0  -5 ]
[ 0  0  0  1   1 ]

from which it follows that a_0 = 4, a_1 = 3, a_2 = -5, a_3 = 1. Thus, the interpolating polynomial is


p(x) = 4 + 3x - 5x^2 + x^3

The graph of this polynomial and the given points are shown in Figure 1.8.12.


Figure 1.8.12
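The procedure of this example (build the augmented matrix (14), then apply Gauss-Jordan elimination) can be sketched as follows. The `rref` function is a straightforward illustrative implementation written for this sketch, not code from the text, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def rref(rows):
    """Reduce an augmented matrix to reduced row echelon form."""
    M = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(M[0]) - 1):            # the last column is the right-hand side
        pr = next((r for r in range(pivot_row, len(M)) if M[r][col] != 0), None)
        if pr is None:
            continue                            # no pivot in this column
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        pv = M[pivot_row][col]
        M[pivot_row] = [v / pv for v in M[pivot_row]]   # leading 1
        for r in range(len(M)):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M

points = [(1, 3), (2, -2), (3, -5), (4, 0)]
n = len(points)

# Each row is (1, x, x^2, ..., x^(n-1), y), as in the augmented matrix (14).
augmented = [[x ** k for k in range(n)] + [y] for x, y in points]
coefficients = [row[-1] for row in rref(augmented)]   # a_0, a_1, a_2, a_3

def p(x):
    """Evaluate the interpolating polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coefficients))

print("coefficients:", [str(a) for a in coefficients])
print("passes through all points:", all(p(x) == y for x, y in points))
```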


Remark Later we will give a more efficient method for finding interpolating polynomials that is better suited for 
problems in which the number of data points is large.


CALCULUS AND CALCULATING UTILITY REQUIRED 


EXAMPLE 7 Approximate Integration


There is no way to evaluate the integral

∫_0^1 sin(πx^2/2) dx

directly since there is no way to express an antiderivative of the integrand in terms of elementary functions.
This integral could be approximated by Simpson's rule or some comparable method, but an alternative
approach is to approximate the integrand by an interpolating polynomial and integrate the approximating
polynomial. For example, let us consider the five points


x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1


that divide the interval [0, 1] into four equally spaced subintervals. The values of


f(x) = sin(πx^2/2)


at these points are approximately

f(0) = 0, f(0.25) = 0.098017, f(0.5) = 0.382683, f(0.75) = 0.77301, f(1) = 1

The interpolating polynomial is (verify)


p(x) = 0.098796x + 0.762356x^2 + 2.14429x^3 - 2.00544x^4 (15)

and

∫_0^1 p(x) dx ≈ 0.438501 (16)


As shown in Figure 1.8.13, the graphs of f and p match very closely over the interval [0, 1], so the
approximation is quite good.
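The value in (16) can be reproduced by integrating (15) term by term, and compared with a direct numerical estimate of the original integral. This is a sketch only: the composite Simpson's rule with 100 subintervals is a comparison method chosen here, and the function names are illustrative.

```python
import math

# Coefficients of p(x) in (15), in order of increasing power (constant term is 0).
coeffs = [0.0, 0.098796, 0.762356, 2.14429, -2.00544]

# The integral of c * x^k over [0, 1] is c / (k + 1).
poly_integral = sum(c / (k + 1) for k, c in enumerate(coeffs))

def f(x):
    """The integrand sin(pi * x^2 / 2)."""
    return math.sin(math.pi * x * x / 2)

def simpson(g, a, b, n=100):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

reference = simpson(f, 0.0, 1.0)
print(f"integral of p: {poly_integral:.6f}")
print(f"Simpson estimate of the original integral: {reference:.6f}")
print(f"difference: {abs(poly_integral - reference):.6f}")
```

The two estimates agree to about three decimal places, which is consistent with the close match between the graphs of f and p.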
Figure 1.8.13


Concept Review 
* Network 

* Branches 

* Nodes 

* Flow conservation 


* Electrical circuits: battery, resistor, poles (positive and negative), electrical potential, Ohm's law, Kirchhoff's 
current law, Kirchhoff's voltage law 


* Chemical equations: reactants, products, balanced equation 

* Interpolating polynomial

Skills 

* Find the flow rates and directions of flow in branches of a network. 

* Find the amount of current flowing through parts of an electrical circuit. 
* Write a balanced chemical equation for a given chemical reaction. 


* Find an interpolating polynomial for a graph passing through a given collection of points. 


Exercise Set 1.8 


1. The accompanying figure shows a network in which the flow rate and direction of flow in certain branches are known. 
Find the flow rates and directions of flow in the remaining branches.

Figure Ex-1
Answer: 

50 

40 10 

30 60 

10 50 

40 


2. The accompanying figure shows known flow rates of hydrocarbons into and out of a network of pipes at an oil refinery. 


(a) Set up a linear system whose solution provides the unknown flow rates. 


(b) Solve the system for the unknown flow rates. 
(c) Find the flow rates and directions of flow if \(x_4 = 50\) and \(x_6 = 0\).


Figure Ex-2


3. The accompanying figure shows a network of one-way streets with traffic flowing in the directions indicated. The flow
rates along the streets are measured as the average number of vehicles per hour.
(a) Set up a linear system whose solution provides the unknown flow rates. 
(b) Solve the system for the unknown flow rates. 


(c) If the flow along the road from A to B must be reduced for construction, what is the minimum flow that is required 
to keep traffic flowing on all roads? 


Figure Ex-3 


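Answer:


(a) \(x_3 - x_4 = -500, \quad -x_1 + x_4 = 100, \quad x_1 - x_2 = 300, \quad x_2 - x_3 = 100\)
(b) \(x_1 = -100 + t, \quad x_2 = -400 + t, \quad x_3 = -500 + t, \quad x_4 = t\)


(c) For all rates to be nonnegative, we need \(t \geq 500\), so the minimum is t = 500 cars per hour, which gives \(x_1 = 400, x_2 = 100, x_3 = 0, x_4 = 500\)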


4. The accompanying figure shows a network of one-way streets with traffic flowing in the directions indicated. The flow
rates along the streets are measured as the average number of vehicles per hour.
(a) Set up a linear system whose solution provides the unknown flow rates. 
(b) Solve the system for the unknown flow rates. 


(c) Is it possible to close the road from A to B for construction and keep traffic flowing on the other streets? Explain. 


Figure Ex-4


In Exercises 5-8, analyze the given electrical circuits by finding the unknown currents. 


5. 


IV+ 


202 


Answer: 


п=ц=5= = 14, h=h=04 


um 

3V ГА 
In Exercises 9-12, write a balanced equation for the given chemical reaction. 
9. \(\mathrm{C_3H_8 + O_2 \to CO_2 + H_2O}\) (propane combustion)


Answer:

\(x_1 = 1, x_2 = 5, x_3 = 3\), and \(x_4 = 4\); the balanced equation is \(\mathrm{C_3H_8 + 5O_2 \to 3CO_2 + 4H_2O}\)
10. \(\mathrm{C_6H_{12}O_6 \to CO_2 + C_2H_5OH}\) (fermentation of sugar)
11. \(\mathrm{CH_3COF + H_2O \to CH_3COOH + HF}\)


Answer:

\(x_1 = x_2 = x_3 = x_4 = t\); the balanced equation is \(\mathrm{CH_3COF + H_2O \to CH_3COOH + HF}\)
12. \(\mathrm{CO_2 + H_2O \to C_6H_{12}O_6 + O_2}\) (photosynthesis)
13. Find the quadratic polynomial whose graph passes through the points (1, 1), (2, 2), and (3, 5). 


Answer: 


\[ p(x) = x^2 - 2x + 2 \]
14. Find the quadratic polynomial whose graph passes through the points (0, 0), (—1, 1), and (1, 1). 
15. Find the cubic polynomial whose graph passes through the points (—1, —1), (0, 1), (1, 3), (4, -1). 


Answer:


\[ p(x) = 1 + \frac{13}{6}x - \frac{1}{6}x^3 \]


16. The accompanying figure shows the graph of a cubic polynomial. Find the polynomial.


Figure Ex-16
17. (a) Find an equation that represents the family of all second-degree polynomials that pass through the points (0, 1) and
(1, 2). [Hint: The equation will involve one arbitrary parameter that produces the members of the family when
varied.]


(b) By hand, or with the help of a graphing utility, sketch four curves in the family. 
Answer: 


(a) Using \(a_1 = k\) as a parameter, \(p(x) = 1 + kx + (1 - k)x^2\) where \(-\infty < k < \infty\).
(b) The graphs for k = 0, 1, 2, and 3 are shown.


[Figure: graphs of the family for k = 0, 1, 2, and 3]


18. In this section we have selected only a few applications of linear systems. Using the Internet as a search tool, try to find 
some more real-world applications of such systems. Select one that is of interest to you, and write a paragraph about it. 


True-False Exercises 


In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 
(a) In any network, the sum of the flows out of a node must equal the sum of the flows into a node. 
Answer: 


True 


(b) When a current passes through a resistor, there is an increase in the electrical potential in a circuit. 
Answer: 


False 


(c) Kirchhoff's current law states that the sum of the currents flowing into a node equals the sum of the currents flowing out
of the node. 


Answer: 


True 


(d) A chemical equation is called balanced if the total number of atoms on each side of the equation is the same.
Answer: 


False 


(e) Given any n points in the xy-plane, there is a unique polynomial of degree n − 1 or less whose graph passes through
those points. 


Answer: 


False 
1.9 Leontief Input-Output Models


In 1973 the economist Wassily Leontief was awarded the Nobel prize for his work on economic modeling in which he 
used matrix methods to study the relationships between different sectors in an economy. In this section we will discuss 
some of the ideas developed by Leontief. 


Inputs and Outputs in an Economy 


One way to analyze an economy is to divide it into sectors and study how the sectors interact with one another. For 
example, a simple economy might be divided into three sectors—manufacturing, agriculture, and utilities. Typically, a 
sector will produce certain outputs but will require inputs from the other sectors and itself. For example, the agricultural 
sector may produce wheat as an output but will require inputs of farm machinery from the manufacturing sector, 
electrical power from the utilities sector, and food from its own sector to feed its workers. Thus, we can imagine an 
economy to be a network in which inputs and outputs flow in and out of the sectors; the study of such flows is called 
input-output analysis. Inputs and outputs are commonly measured in monetary units (dollars or millions of dollars, for 
example) but other units of measurement are also possible. 


The flows between sectors of a real economy are not always obvious. For example, in World War II the United States had 
a demand for 50,000 new airplanes that required the construction of many new aluminum manufacturing plants. This 
produced an unexpectedly large demand for certain copper electrical components, which in turn produced a copper 
shortage. The problem was eventually resolved by using silver borrowed from Fort Knox as a copper substitute. In all 
likelihood modern input-output analysis would have anticipated the copper shortage. 


Most sectors of an economy will produce outputs, but there may exist sectors that consume outputs without producing 
anything themselves (the consumer market, for example). Those sectors that do not produce outputs are called open 
sectors. Economies with no open sectors are called closed economies, and economies with one or more open sectors are 
called open economies (Figure 1.9.1). In this section we will be concerned with economies with one open sector, and our 
primary goal will be to determine the output levels that are required for the productive sectors to sustain themselves and 
satisfy the demand of the open sector. 


Manufacturing Agriculture 


Figure 1.9.1 


Leontief Model of an Open Economy 


Let us consider a simple open economy with one open sector and three product-producing sectors: manufacturing, 
agriculture, and utilities. Assume that inputs and outputs are measured in dollars and that the inputs required by the
productive sectors to produce one dollar's worth of output are in accordance with Table 1.


Table 1

                  Income Required per Dollar Output
Provider        | Manufacturing | Agriculture | Utilities
Manufacturing   | $0.50         | $0.10       | $0.10
Agriculture     | $0.20         | $0.50       | $0.30
Utilities       | $0.10         | $0.30       | $0.40


Historical Note It is somewhat ironic that it was the Russian-born Wassily Leontief who won the Nobel prize
in 1973 for pioneering the modern methods for analyzing free-market economies. Leontief was a precocious 
student who entered the University of Leningrad at age 15. Bothered by the intellectual restrictions of the Soviet 
system, he was put in jail for anti-Communist activities, after which he headed for the University of Berlin, 
receiving his Ph.D. there in 1928. He came to the United States in 1931, where he held professorships at Harvard 
and then New York University. 

[Image: © Bettmann/© Corbis]


Usually, one would suppress the labeling and express this matrix as

\[ C = \begin{bmatrix} 0.5 & 0.1 & 0.1 \\ 0.2 & 0.5 & 0.3 \\ 0.1 & 0.3 & 0.4 \end{bmatrix} \tag{1} \]

This is called the consumption matrix (or sometimes the technology matrix) for the economy. The column vectors

\[ \mathbf{c}_1 = \begin{bmatrix} 0.5 \\ 0.2 \\ 0.1 \end{bmatrix}, \quad \mathbf{c}_2 = \begin{bmatrix} 0.1 \\ 0.5 \\ 0.3 \end{bmatrix}, \quad \mathbf{c}_3 = \begin{bmatrix} 0.1 \\ 0.3 \\ 0.4 \end{bmatrix} \]

in C list the inputs required by the manufacturing, agricultural, and utilities sectors, respectively, to produce $1.00 worth
of output. These are called the consumption vectors of the sectors. For example, \(\mathbf{c}_1\) tells us that to produce $1.00 worth of
output the manufacturing sector needs $0.50 worth of manufacturing output, $0.20 worth of agricultural output, and
$0.10 worth of utilities output.


What is the economic significance of the row sums 
of the consumption matrix? 


Continuing with the above example, suppose that the open sector wants the economy to supply it manufactured goods, 
agricultural products, and utilities with dollar values: 


\(d_1\) dollars of manufactured goods
\(d_2\) dollars of agricultural products
\(d_3\) dollars of utilities


The column vector d that has these numbers as successive components is called the outside demand vector. Since the 
product-producing sectors consume some of their own output, the dollar value of their output must cover their own needs 
plus the outside demand. Suppose that the dollar values required to do this are 


\(x_1\) dollars of manufactured goods
\(x_2\) dollars of agricultural products
\(x_3\) dollars of utilities


The column vector x that has these numbers as successive components is called the production vector for the economy. 
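For the economy with consumption matrix (1), that portion of the production vector x that will be consumed by the three
productive sectors is

\[
x_1 \underbrace{\begin{bmatrix} 0.5 \\ 0.2 \\ 0.1 \end{bmatrix}}_{\substack{\text{Fractions} \\ \text{consumed by} \\ \text{manufacturing}}}
+ x_2 \underbrace{\begin{bmatrix} 0.1 \\ 0.5 \\ 0.3 \end{bmatrix}}_{\substack{\text{Fractions} \\ \text{consumed by} \\ \text{agriculture}}}
+ x_3 \underbrace{\begin{bmatrix} 0.1 \\ 0.3 \\ 0.4 \end{bmatrix}}_{\substack{\text{Fractions} \\ \text{consumed} \\ \text{by utilities}}}
= \begin{bmatrix} 0.5 & 0.1 & 0.1 \\ 0.2 & 0.5 & 0.3 \\ 0.1 & 0.3 & 0.4 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = C\mathbf{x}
\]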


The vector Cx is called the intermediate demand vector for the economy. Once the intermediate demand is met, the 
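portion of the production that is left to satisfy the outside demand is \(\mathbf{x} - C\mathbf{x}\). Thus, if the outside demand vector is d, then
x must satisfy the equation

\[
\underbrace{\mathbf{x}}_{\substack{\text{Amount} \\ \text{produced}}} - \underbrace{C\mathbf{x}}_{\substack{\text{Intermediate} \\ \text{demand}}} = \underbrace{\mathbf{d}}_{\substack{\text{Outside} \\ \text{demand}}}
\]

which we will find convenient to rewrite as

\[ (I - C)\mathbf{x} = \mathbf{d} \tag{2} \]

The matrix \(I - C\) is called the Leontief matrix and (2) is called the Leontief equation.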


EXAMPLE 1 Satisfying Outside Demand


Consider the economy described in Table 1. Suppose that the open sector has a demand for $7900 worth of 
manufacturing products, $3950 worth of agricultural products, and $1975 worth of utilities. 


(a) Can the economy meet this demand? 


(b) If so, find a production vector x that will meet it exactly. 


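Solution The consumption matrix, production vector, and outside demand vector are

\[
C = \begin{bmatrix} 0.5 & 0.1 & 0.1 \\ 0.2 & 0.5 & 0.3 \\ 0.1 & 0.3 & 0.4 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
\mathbf{d} = \begin{bmatrix} 7900 \\ 3950 \\ 1975 \end{bmatrix} \tag{3}
\]

To meet the outside demand, the vector x must satisfy the Leontief equation (2), so the problem reduces to
solving the linear system

\[
\underbrace{\begin{bmatrix} 0.5 & -0.1 & -0.1 \\ -0.2 & 0.5 & -0.3 \\ -0.1 & -0.3 & 0.6 \end{bmatrix}}_{I - C}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{\mathbf{x}}
= \underbrace{\begin{bmatrix} 7900 \\ 3950 \\ 1975 \end{bmatrix}}_{\mathbf{d}} \tag{4}
\]

(if consistent). We leave it for you to show that the reduced row echelon form of the augmented matrix for
this system is

\[
\begin{bmatrix} 1 & 0 & 0 & 27{,}500 \\ 0 & 1 & 0 & 33{,}750 \\ 0 & 0 & 1 & 24{,}750 \end{bmatrix}
\]

This tells us that (4) is consistent, and the economy can satisfy the demand of the open sector exactly by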


producing $27,500 worth of manufacturing output, $33,750 worth of agricultural output, and $24,750 
worth of utilities output. 
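
The row reduction in Example 1 can be reproduced with a short script. This is a minimal sketch, not the text's own procedure: the helper name `solve` is illustrative, and exact rational arithmetic (Python's `fractions` module) is an assumption made here so that the result is free of round-off.

```python
from fractions import Fraction

def solve(A, b):
    """Solve Ax = b by Gauss-Jordan elimination on the augmented matrix."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]            # leading 1
        for r in range(n):
            if r != col:                            # clear the rest of the column
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# Consumption matrix (1) and outside demand vector from Example 1.
C = [[Fraction(v) for v in row] for row in
     (("0.5", "0.1", "0.1"), ("0.2", "0.5", "0.3"), ("0.1", "0.3", "0.4"))]
d = [Fraction(7900), Fraction(3950), Fraction(1975)]

n = len(C)
leontief = [[int(i == j) - C[i][j] for j in range(n)] for i in range(n)]  # I - C
x = solve(leontief, d)
print([int(v) for v in x])  # production levels for the three sectors
```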


Productive Open Economies 


In the preceding discussion we considered an open economy with three product-producing sectors; the same ideas apply 
to an open economy with n product-producing sectors. In this case, the consumption matrix, production vector, and 
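outside demand vector have the form

\[
C = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
\mathbf{d} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}
\]

where all entries are nonnegative and
\(c_{ij}\) = the monetary value of the output of the ith sector that is needed by the jth sector to produce one unit of output
\(x_i\) = the monetary value of the output of the ith sector
\(d_i\) = the monetary value of the output of the ith sector that is required to meet the demand of the open sector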


Remark Note that the jth column vector of C contains the monetary values that the jth sector requires of the other 
sectors to produce one monetary unit of output, and the ith row vector of C contains the monetary values required of the 
ith sector by the other sectors for each of them to produce one monetary unit of output. 


As discussed in our example above, a production vector x that meets the demand d of the outside sector must satisfy the 
Leontief equation 


\[ (I - C)\mathbf{x} = \mathbf{d} \]

If the matrix \(I - C\) is invertible, then this equation has the unique solution

\[ \mathbf{x} = (I - C)^{-1}\mathbf{d} \tag{5} \]

for every demand vector d. However, for x to be a valid production vector it must have nonnegative entries, so the
problem of importance in economics is to determine conditions under which the Leontief equation has a solution with
nonnegative entries.


It is evident from the form of (5) that if \(I - C\) is invertible, and if \((I - C)^{-1}\) has nonnegative entries, then for every
demand vector d the corresponding x will also have nonnegative entries, and hence will be a valid production vector for
the economy. Economies for which \((I - C)^{-1}\) has nonnegative entries are said to be productive. Such economies are
desirable because demand can always be met by some level of production. The following theorem, whose proof can be
found in many books on economics, gives conditions under which open economies are productive.


THEOREM 1.9.1 


If C is the consumption matrix for an open economy, and if all of the column sums are less than 1, then the matrix
\(I - C\) is invertible, the entries of \((I - C)^{-1}\) are nonnegative, and the economy is productive.


Remark The jth column sum of C represents the total dollar value of input that the jth sector requires to produce $1 of 
output, so if the jth column sum is less than 1, then the jth sector requires less than $1 of input to produce $1 of output; in 
this case we say that the jth sector is profitable. Thus, Theorem 1.9.1 states that if all product-producing sectors of an 
open economy are profitable, then the economy is productive. In the exercises we will ask you to show that an open 
economy is productive if all of the row sums of C are less than 1 (Exercise 11). Thus, an open economy is productive if 
either all of the column sums or all of the row sums of C are less than 1. 


EXAMPLE 2 An Open Economy Whose Sectors Are All Profitable


The column sums of the consumption matrix C in (1) are less than 1, so \((I - C)^{-1}\) exists and has nonnegative
entries. Use a calculating utility to confirm this, and use this inverse to solve Equation (4) in Example 1.


Solution We leave it for you to show that

\[
(I - C)^{-1} \approx \begin{bmatrix} 2.65823 & 1.13924 & 1.01266 \\ 1.89873 & 3.67089 & 2.15190 \\ 1.39241 & 2.02532 & 2.91139 \end{bmatrix}
\]

This matrix has nonnegative entries, and

\[
\mathbf{x} = (I - C)^{-1}\mathbf{d} \approx \begin{bmatrix} 2.65823 & 1.13924 & 1.01266 \\ 1.89873 & 3.67089 & 2.15190 \\ 1.39241 & 2.02532 & 2.91139 \end{bmatrix}
\begin{bmatrix} 7900 \\ 3950 \\ 1975 \end{bmatrix} \approx \begin{bmatrix} 27{,}500 \\ 33{,}750 \\ 24{,}750 \end{bmatrix}
\]


which is consistent with the solution in Example 1. 
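
For readers who want to carry out the "calculating utility" step, here is a minimal sketch in Python. It checks the hypothesis of Theorem 1.9.1 (all column sums less than 1), inverts the Leontief matrix by Gauss-Jordan elimination, and confirms that the inverse has nonnegative entries. The function name `inverse` and the use of exact fractions are assumptions of this sketch, not part of the text.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    M = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]  # right half is the inverse

# Consumption matrix (1) and the demand vector from Example 1.
C = [[Fraction(v) for v in row] for row in
     (("0.5", "0.1", "0.1"), ("0.2", "0.5", "0.3"), ("0.1", "0.3", "0.4"))]
d = [7900, 3950, 1975]
n = len(C)

# Hypothesis of Theorem 1.9.1: every column sum is less than 1.
column_sums = [sum(C[i][j] for i in range(n)) for j in range(n)]
profitable = all(s < 1 for s in column_sums)

leontief = [[int(i == j) - C[i][j] for j in range(n)] for i in range(n)]
inv = inverse(leontief)
productive = all(entry >= 0 for row in inv for entry in row)

# Production vector x = (I - C)^{-1} d, as in Formula (5).
x = [sum(inv[i][j] * d[j] for j in range(n)) for i in range(n)]

print(profitable, productive)
print([[round(float(e), 5) for e in row] for row in inv])
print([int(v) for v in x])
```

Working with fractions shows that the decimal entries displayed in Example 2 are rounded values; for instance, the (1, 1) entry is exactly 210/79.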


Concept Review 

* Sectors

* Inputs

* Outputs

* Input-output analysis

* Open sector

* Economies: open, closed

* Consumption (technology) matrix

* Consumption vector

* Outside demand vector

* Production vector

* Intermediate demand vector

* Leontief matrix

* Leontief equation


Skills 
* Construct a consumption matrix for an economy. 


* Understand the relationships among the vectors of a sector of an economy: consumption, outside demand, 
production, and intermediate demand. 


Exercise Set 1.9 


1. An automobile mechanic (M) and a body shop (B) use each other's services. For each $1.00 of business that M does, it
uses $0.50 of its own services and $0.25 of B's services, and for each $1.00 of business that B does it uses $0.10 of its 
own services and $0.25 of M's services. 


(a) Construct a consumption matrix for this economy. 


(b) How much must M and B each produce to provide customers with $7000 worth of mechanical work and $14,000 
worth of body work? 


Answer:


(a) \[ \begin{bmatrix} 0.50 & 0.25 \\ 0.25 & 0.10 \end{bmatrix} \]


(b) \[ \begin{bmatrix} \$25{,}290 \\ \$22{,}581 \end{bmatrix} \]


2. A simple economy produces food (F) and housing (H). The production of $1.00 worth of food requires $0.30 worth of
food and $0.10 worth of housing, and the production of $1.00 worth of housing requires $0.20 worth of food and
$0.60 worth of housing.


(a) Construct a consumption matrix for this economy. 


(b) What dollar value of food and housing must be produced for the economy to provide consumers $130,000 worth 
of food and $130,000 worth of housing? 


3. Consider the open economy described by the accompanying table, where the input is in dollars needed for $1.00 of
output.
(a) Find the consumption matrix for the economy. 


(b) Suppose that the open sector has a demand for $1930 worth of housing, $3860 worth of food, and $5790 worth of 
utilities. Use row reduction to find a production vector that will meet this demand exactly. 


Table Ex-3

              Income Required per Dollar Output
Provider    | Housing | Food  | Utilities
Housing     | $0.10   | $0.60 | $0.40
Food        | $0.30   | $0.20 | $0.30
Utilities   | $0.40   | $0.10 | $0.20


Answer:


(a) \[ \begin{bmatrix} 0.1 & 0.6 & 0.4 \\ 0.3 & 0.2 & 0.3 \\ 0.4 & 0.1 & 0.2 \end{bmatrix} \]


(b) \[ \begin{bmatrix} \$31{,}500 \\ \$26{,}500 \\ \$26{,}300 \end{bmatrix} \]


4. A company produces Web design, software, and networking services. View the company as an open economy 
described by the accompanying table, where input is in dollars needed for $1.00 of output. 


(a) Find the consumption matrix for the company. 


(b) Suppose that the customers (the open sector) have a demand for $5400 worth of Web design, $2700 worth of 
software, and $900 worth of networking. Use row reduction to find a production vector that will meet this demand 
exactly. 


Table Ex-4 


Income Required per Dollar Output 


Web Design | Software| Networking 


Web Design | $ 0.40 $ 0.20 


Provider| Software $ 0.30 $ 0.35 
Networking | $0.15 $0.10 


In Exercises 5—6, use matrix inversion to find the production vector x that meets the demand d for the consumption 
matrix C. 


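5. \[ C = \begin{bmatrix} 0.1 & 0.3 \\ 0.5 & 0.4 \end{bmatrix}; \quad \mathbf{d} = \begin{bmatrix} 50 \\ 60 \end{bmatrix} \]
Answer:

\[ \begin{bmatrix} 123.08 \\ 202.56 \end{bmatrix} \]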

6. \[ C = \begin{bmatrix} 0.3 & 0.1 \\ 0.3 & 0.7 \end{bmatrix}; \quad \mathbf{d} = \begin{bmatrix} 22 \\ 14 \end{bmatrix} \]


7. Consider an open economy with consumption matrix 


(a) Show that the economy can meet a demand of \(d_1 = 2\) units from the first sector and \(d_2 = 0\) units from the second
sector, but it cannot meet a demand of \(d_1 = 2\) units from the first sector and \(d_2 = 1\) unit from the second sector.


(b) Give both a mathematical and an economic explanation of the result in part (a). 


8. Consider an open economy with consumption matrix 


Pl Pole Pole 
Aj cafe 1 


If the open sector demands the same dollar value from each product-producing sector, which such sector must 
produce the greatest dollar value to meet the demand? 


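9. Consider an open economy with consumption matrix

\[ C = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & 0 \end{bmatrix} \]

Show that the Leontief equation \(\mathbf{x} - C\mathbf{x} = \mathbf{d}\) has a unique solution for every demand vector d if \(c_{21}c_{12} < 1 - c_{11}\).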


10. (a) Consider an open economy with a consumption matrix C whose column sums are less than 1, and let x be the
production vector that satisfies an outside demand d; that is, \((I - C)^{-1}\mathbf{d} = \mathbf{x}\). Let \(\mathbf{d}_j\) be the demand vector that is
obtained by increasing the jth entry of d by 1 and leaving the other entries fixed. Prove that the production vector
\(\mathbf{x}_j\) that meets this demand is

\[ \mathbf{x}_j = \mathbf{x} + \text{jth column vector of } (I - C)^{-1} \]

(b) In words, what is the economic significance of the jth column vector of \((I - C)^{-1}\)? [Hint: Look at \(\mathbf{x}_j - \mathbf{x}\).]
11. Prove: If C is an n × n matrix whose entries are nonnegative and whose row sums are less than 1, then \(I - C\) is
invertible and its inverse has nonnegative entries. [Hint: \((A^T)^{-1} = (A^{-1})^T\) for any invertible matrix A.]


True-False Exercises 
In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 
(a) Sectors of an economy that produce outputs are called open sectors. 

Answer: 


False 


(b) A closed economy is an economy that has no open sectors. 
Answer: 


True 


(c) The rows of a consumption matrix represent the outputs in a sector of an economy. 
Answer: 


False 


(d) If the column sums of the consumption matrix are all less than 1, then the Leontief matrix is invertible.
Answer: 


True 


(e) The Leontief equation relates the production vector for an economy to the outside demand vector.
Answer: 


True 
Chapter 1 Supplementary Exercises


In Exercises 1—4 the given matrix represents an augmented matrix for a linear system. Write the 
corresponding set of linear equations for the system, and use Gaussian elimination to solve the linear system. 
Introduce free parameters as necessary. 


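1. \[ \begin{bmatrix} 3 & -1 & 0 & 4 & 1 \\ 2 & 0 & 3 & 3 & -1 \end{bmatrix} \]
Answer:
\[ 3x_1 - x_2 + 4x_4 = 1 \]
\[ 2x_1 + 3x_3 + 3x_4 = -1 \]
\[ x_1 = -\tfrac{3}{2}s - \tfrac{3}{2}t - \tfrac{1}{2}, \quad x_2 = -\tfrac{9}{2}s - \tfrac{1}{2}t - \tfrac{5}{2}, \quad x_3 = s, \quad x_4 = t \]
2. \[ \begin{bmatrix} 1 & 4 & -1 \\ -2 & -8 & 2 \\ 3 & 12 & -3 \\ 0 & 0 & 0 \end{bmatrix} \]
3. \[ \begin{bmatrix} 2 & -4 & 1 & 6 \\ -4 & 0 & 3 & -1 \\ 0 & 1 & -1 & 3 \end{bmatrix} \]
Answer:
\[ 2x_1 - 4x_2 + x_3 = 6 \]
\[ -4x_1 + 3x_3 = -1 \]
\[ x_2 - x_3 = 3 \]
\[ x_1 = -\tfrac{17}{2}, \quad x_2 = -\tfrac{26}{3}, \quad x_3 = -\tfrac{35}{3} \]
4. \[ \begin{bmatrix} 3 & 1 & -2 \\ -9 & -3 & 6 \\ 6 & 2 & 1 \end{bmatrix} \]
5. Use Gauss-Jordan elimination to solve for x' and y' in terms of x and y.
\[ x = \tfrac{3}{5}x' - \tfrac{4}{5}y' \]
\[ y = \tfrac{4}{5}x' + \tfrac{3}{5}y' \]
Answer:
\[ x' = \tfrac{3}{5}x + \tfrac{4}{5}y, \quad y' = -\tfrac{4}{5}x + \tfrac{3}{5}y \]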

6. Use Gauss-Jordan elimination to solve for x' and y' in terms of x and y.

\[ x = x'\cos\theta - y'\sin\theta \]
\[ y = x'\sin\theta - y'\cos\theta \]
7. Find positive integers that satisfy
\[ x + y + z = 9 \]
\[ x + 5y + 10z = 44 \]


Answer:


\[ x = 4, \quad y = 2, \quad z = 3 \]
8. A box containing pennies, nickels, and dimes has 13 coins with a total value of 83 cents. How many coins 
of each type are in the box? 
9. Let

\[ \begin{bmatrix} a & 0 & b & 2 \\ a & a & 4 & 4 \\ 0 & a & 2 & b \end{bmatrix} \]

be the augmented matrix for a linear system. Find for what values of a and b the system has
(a) a unique solution.

(b) a one-parameter solution.

(c) a two-parameter solution.

(d) no solution.
Answer:


(a) \(a \neq 0, \; b \neq 2\)
(b) \(a \neq 0, \; b = 2\)
(c) \(a = 0, \; b = 2\)
(d) \(a = 0, \; b \neq 2\)
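10. For which value(s) of a does the following system have zero solutions? One solution? Infinitely many
solutions?
\[ x_1 + x_2 + x_3 = 4 \]
\[ x_3 = 2 \]
\[ (a^2 - 4)x_3 = a - 2 \]


11. Find a matrix K such that \(AKB = C\) given that

\[
A = \begin{bmatrix} 1 & 4 \\ -2 & 3 \\ 1 & -2 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & -1 \end{bmatrix}, \quad
C = \begin{bmatrix} 8 & 6 & -6 \\ 6 & -1 & 1 \\ -4 & 0 & 0 \end{bmatrix}
\]

Answer:

\[ K = \begin{bmatrix} 0 & 2 \\ 1 & 1 \end{bmatrix} \]
12. How should the coefficients a, b, and c be chosen so that the system
\[ ax + by - 3z = -3 \]
\[ -2x - by + cz = -1 \]
\[ ax + 3y - cz = -3 \]
has the solution x = 1, y = −1, and z = 2?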


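13. In each part, solve the matrix equation for X.

(a) \[ X \begin{bmatrix} -1 & 0 & 1 \\ 1 & 1 & 0 \\ 3 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 0 \\ -3 & 1 & 5 \end{bmatrix} \]

(b) \[ X \begin{bmatrix} 1 & -1 & 2 \\ 3 & 0 & 1 \end{bmatrix} = \begin{bmatrix} -5 & -1 & 0 \\ 6 & -3 & 7 \end{bmatrix} \]

(c) \[ \begin{bmatrix} 3 & 1 \\ -1 & 2 \end{bmatrix} X - X \begin{bmatrix} 1 & 4 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 2 & -2 \\ 5 & 4 \end{bmatrix} \]

Answer:
(a) \[ X = \begin{bmatrix} -1 & 3 & -1 \\ 6 & 0 & 1 \end{bmatrix} \]
(b) \[ X = \begin{bmatrix} 1 & -2 \\ 3 & 1 \end{bmatrix} \]
(c) \[ X = \begin{bmatrix} -\frac{113}{37} & -\frac{160}{37} \\ -\frac{20}{37} & -\frac{46}{37} \end{bmatrix} \]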


14. Let A be a square matrix.
(a) Show that \((I - A)^{-1} = I + A + A^2 + A^3\) if \(A^4 = 0\).
(b) Show that
\[ (I - A)^{-1} = I + A + A^2 + \cdots + A^n \]
if \(A^{n+1} = 0\).
15. Find values of a, b, and c such that the graph of the polynomial \(p(x) = ax^2 + bx + c\) passes through the
points (1, 2), (−1, 6), and (2, 3).
Answer:


\[ a = 1, \quad b = -2, \quad c = 3 \]
16. (Calculus required) Find values of a, b, and c such that the graph of the polynomial
\(p(x) = ax^2 + bx + c\) passes through the point (−1, 0) and has a horizontal tangent at (2, −9).

17. Let \(J_n\) be the n × n matrix each of whose entries is 1. Show that if n > 1, then

\[ (I - J_n)^{-1} = I - \frac{1}{n - 1}J_n \]

18. Show that if a square matrix A satisfies
\[ A^3 + 4A^2 - 2A + 7I = 0 \]
then so does \(A^T\).

19. Prove: If B is invertible, then \(AB^{-1} = B^{-1}A\) if and only if \(AB = BA\).

20. Prove: If A is invertible, then \(A + B\) and \(I + BA^{-1}\) are both invertible or both not invertible.

21. Prove: If A is an m × n matrix and B is the n × 1 matrix each of whose entries is 1/n, then

\[ AB = \begin{bmatrix} \bar{r}_1 \\ \bar{r}_2 \\ \vdots \\ \bar{r}_m \end{bmatrix} \]

where \(\bar{r}_i\) is the average of the entries in the ith row of A.

22. (Calculus required) If the entries of the matrix

\[
C = \begin{bmatrix} c_{11}(x) & c_{12}(x) & \cdots & c_{1n}(x) \\ c_{21}(x) & c_{22}(x) & \cdots & c_{2n}(x) \\ \vdots & \vdots & & \vdots \\ c_{m1}(x) & c_{m2}(x) & \cdots & c_{mn}(x) \end{bmatrix}
\]

are differentiable functions of x, then we define

\[
\frac{dC}{dx} = \begin{bmatrix} c'_{11}(x) & c'_{12}(x) & \cdots & c'_{1n}(x) \\ c'_{21}(x) & c'_{22}(x) & \cdots & c'_{2n}(x) \\ \vdots & \vdots & & \vdots \\ c'_{m1}(x) & c'_{m2}(x) & \cdots & c'_{mn}(x) \end{bmatrix}
\]

Show that if the entries in A and B are differentiable functions of x and the sizes of the matrices are such
that the stated operations can be performed, then

(a) \[ \frac{d}{dx}(kA) = k\frac{dA}{dx} \]

(b) \[ \frac{d}{dx}(A + B) = \frac{dA}{dx} + \frac{dB}{dx} \]

(c) \[ \frac{d}{dx}(AB) = \frac{dA}{dx}B + A\frac{dB}{dx} \]

23. (Calculus required) Use part (c) of Exercise 22 to show that

\[ \frac{dA^{-1}}{dx} = -A^{-1}\frac{dA}{dx}A^{-1} \]

State all the assumptions you make in obtaining this formula.

24. Assuming that the stated inverses exist, prove the following equalities.

(a) \[ (C^{-1} + D^{-1})^{-1} = C(C + D)^{-1}D \]
(b) \[ (I + CD)^{-1}C = C(I + DC)^{-1} \]
(c) \[ (C + DD^T)^{-1}D = C^{-1}D(I + D^TC^{-1}D)^{-1} \]
Determinants


CHAPTER 2


CHAPTER CONTENTS 


2.1. Determinants by Cofactor Expansion 
2.2. Evaluating Determinants by Row Reduction 


2.3. Properties of Determinants; Cramer's Rule 


INTRODUCTION 


In this chapter we will study “determinants” or, more precisely, “determinant functions.”
Unlike real-valued functions, such as \(f(x) = x^2\), that assign a real number to a real
variable x, determinant functions assign a real number \(f(A)\) to a matrix variable A.
Although determinants first arose in the context of solving systems of linear equations,
they are no longer used for that purpose in real-world applications. Although they can be
useful for solving very small linear systems (say two or three unknowns), our main
interest in them stems from the fact that they link together various concepts in linear
algebra and provide a useful formula for the inverse of a matrix.
2.1 Determinants by Cofactor Expansion


In this section we will define the notion of a “determinant.” This will enable us to give a specific formula for the inverse of an 
invertible matrix, whereas up to now we have had only a computational procedure for finding it. This, in turn, will eventually 
provide us with a formula for solutions of certain kinds of linear systems. 


Recall from Theorem 1.4.5 that the 2 × 2 matrix

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

is invertible if and only if \(ad - bc \neq 0\) and that the expression \(ad - bc\) is called the determinant of the matrix A. Recall also
that this determinant is denoted by writing

\[ \det(A) = ad - bc \quad \text{or} \quad \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \tag{1} \]

and that the inverse of A can be expressed in terms of the determinant as

\[ A^{-1} = \frac{1}{\det(A)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \tag{2} \]


WARNING


It is important to keep in mind that det(A) is a number,
whereas A is a matrix.


Minors and Cofactors 


One of our main goals in this chapter is to obtain an analog of Formula (2) that is applicable to square matrices of all orders. For
this purpose we will find it convenient to use subscripted entries when writing matrices or determinants. Thus, if we denote a
2 × 2 matrix as

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]

then the two equations in (1) take the form

\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \tag{3} \]


We define the determinant of a 1 × 1 matrix \(A = [a_{11}]\)
as \(\det[A] = \det[a_{11}] = a_{11}\)

The following definition will be key to our goal of extending the definition of a determinant to higher order matrices. 


DEFINITION 1 


If A is a square matrix, then the minor of entry \(a_{ij}\) is denoted by \(M_{ij}\) and is defined to be the determinant of the
submatrix that remains after the ith row and jth column are deleted from A. The number \((-1)^{i+j}M_{ij}\) is denoted by
\(C_{ij}\) and is called the cofactor of entry \(a_{ij}\).


EXAMPLE 1 Finding Minors and Cofactors


Let 


WARNING 


We have followed the standard convention of 
using capital letters to denote minors and cofactors 
even though they are numbers, not matrices. 


The minor of entry \(a_{11}\) is


\(M_{11}\) =


The cofactor of \(a_{11}\) is


Similarly, the minor of entry \(a_{32}\) is


The cofactor of \(a_{32}\) is


\[ C_{32} = (-1)^{3+2}M_{32} = -M_{32} = -26 \]


Historical Note The term determinant was first introduced by the German mathematician Carl Friedrich 
Gauss in 1801 (see p. 15), who used them to “determine” properties of certain kinds of functions. 
Interestingly, the term matrix is derived from a Latin word for “womb” because it was viewed as a container 
of determinants. 


Historical Note The term minor is apparently due to the English mathematician James Sylvester (see p. 
34), who wrote the following in a paper published in 1850: “Now conceive any one line and any one column 
be struck out, we get... a square, one term less in breadth and depth than the original square; and by varying 
in every possible selection of the line and column excluded, we obtain, supposing the original square to
consist of n lines and n columns, \(n^2\) such minor squares, each of which will represent what I term a “First
Minor Determinant” relative to the principal or complete determinant.”


Remark Note that a minor \(M_{ij}\) and its corresponding cofactor \(C_{ij}\) are either the same or negatives of each other and that the
relating sign \((-1)^{i+j}\) is either +1 or −1 in accordance with the pattern in the “checkerboard” array

\[
\begin{bmatrix} + & - & + & - & + & \cdots \\ - & + & - & + & - & \cdots \\ + & - & + & - & + & \cdots \\ - & + & - & + & - & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \end{bmatrix}
\]

For example,

\[ C_{11} = M_{11}, \quad C_{21} = -M_{21}, \quad C_{22} = M_{22} \]

and so forth. Thus, it is never really necessary to calculate \((-1)^{i+j}\) to calculate \(C_{ij}\); you can simply compute the minor \(M_{ij}\)
and then adjust the sign in accordance with the checkerboard pattern. Try this in Example 1.


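EXAMPLE 2 Cofactor Expansions of a 2 × 2 Matrix


The checkerboard pattern for a 2 × 2 matrix \(A = [a_{ij}]\) is

\[ \begin{bmatrix} + & - \\ - & + \end{bmatrix} \]

so that

\[ C_{11} = M_{11} = a_{22}, \quad C_{12} = -M_{12} = -a_{21} \]
\[ C_{21} = -M_{21} = -a_{12}, \quad C_{22} = M_{22} = a_{11} \]


We leave it for you to use Formula (3) to verify that det(A) can be expressed in terms of cofactors in the following
four ways:

\[
\begin{aligned}
\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}
&= a_{11}C_{11} + a_{12}C_{12} \\
&= a_{21}C_{21} + a_{22}C_{22} \\
&= a_{11}C_{11} + a_{21}C_{21} \\
&= a_{12}C_{12} + a_{22}C_{22}
\end{aligned} \tag{4}
\]

Each of the last four equations is called a cofactor expansion of det(A). In each cofactor expansion the entries and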
cofactors all come from the same row or same column of А. For example, in the first equation the entries and 
cofactors all come from the first row of A, in the second they all come from the second row of A, in the third they all 
come from the first column of A, and in the fourth they all come from the second column of A. 


Definition of a General Determinant 


Formula 4 is a special case of the following general result, which we will state without proof. 


THEOREM 2.1.1 


If A is an n × n matrix, then regardless of which row or column of A is chosen, the number obtained by multiplying the
entries in that row or column by the corresponding cofactors and adding the resulting products is always the same. 


This result allows us to make the following definition. 


DEFINITION 2 


If A is an n × n matrix, then the number obtained by multiplying the entries in any row or column of A by the
corresponding cofactors and adding the resulting products is called the determinant of A, and the sums themselves are
called cofactor expansions of A. That is,

\[ \det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj} \tag{5} \]
[cofactor expansion along the jth column]
and
\[ \det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} \tag{6} \]
[cofactor expansion along the ith row]


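EXAMPLE 3 Cofactor Expansion Along the First Row


Find the determinant of the matrix

\[ A = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix} \]

by cofactor expansion along the first row.


Solution

\[
\det(A) = \begin{vmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{vmatrix}
= 3\begin{vmatrix} -4 & 3 \\ 4 & -2 \end{vmatrix} - 1\begin{vmatrix} -2 & 3 \\ 5 & -2 \end{vmatrix} + 0\begin{vmatrix} -2 & -4 \\ 5 & 4 \end{vmatrix}
= 3(-4) - (1)(-11) + 0 = -1
\]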


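EXAMPLE 4 Cofactor Expansion Along the First Column


Let A be the matrix in Example 3, and evaluate det(A) by cofactor expansion along the first column of A.


Solution

\[
\det(A) = \begin{vmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{vmatrix}
= 3\begin{vmatrix} -4 & 3 \\ 4 & -2 \end{vmatrix} - (-2)\begin{vmatrix} 1 & 0 \\ 4 & -2 \end{vmatrix} + 5\begin{vmatrix} 1 & 0 \\ -4 & 3 \end{vmatrix}
= 3(-4) - (-2)(-2) + 5(3) = -1
\]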


Note that in Example 4 we had to compute three 
cofactors, whereas in Example 3 only two were 
needed because the third was multiplied by zero. 
As a rule, the best strategy for cofactor 
expansion is to expand along a row or column 
with the most zeros. 


This agrees with the result obtained in Example 3. 
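
Definition 1, Definition 2, and Theorem 2.1.1 can be explored with a small recursive routine. The sketch below is illustrative only: the function names `minor`, `cofactor`, `det`, `expand_row`, and `expand_col` are our own, and indices are 0-based, which leaves the sign \((-1)^{i+j}\) unchanged because shifting both indices by 1 changes the exponent by 2. It evaluates the determinant of the matrix in Example 3 along every row and every column.

```python
def minor(A, i, j):
    """Determinant of the submatrix left after deleting row i and column j."""
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
    return det(sub)

def cofactor(A, i, j):
    """Signed minor: (-1)^(i+j) times the minor of entry (i, j)."""
    return (-1) ** (i + j) * minor(A, i, j)

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]  # determinant of a 1 x 1 matrix is its entry
    # Skip zero entries: their cofactors never need to be computed.
    return sum(A[0][j] * cofactor(A, 0, j)
               for j in range(len(A)) if A[0][j] != 0)

def expand_row(A, i):
    """Cofactor expansion along row i."""
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))

def expand_col(A, j):
    """Cofactor expansion along column j."""
    return sum(A[i][j] * cofactor(A, i, j) for i in range(len(A)))

# The matrix from Example 3.
A = [[3, 1, 0],
     [-2, -4, 3],
     [5, 4, -2]]

row_results = [expand_row(A, i) for i in range(3)]
col_results = [expand_col(A, j) for j in range(3)]
print(row_results, col_results)  # every expansion gives the same number
```

Cofactor expansion is convenient for small matrices and for hand computation, but the number of arithmetic operations grows very quickly with the size of the matrix.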


Charles Lutwidge Dodgson (Lewis Carroll) (1832-1898) 


Historical Note Cofactor expansion is not the only method for expressing the determinant of a matrix 
in terms of determinants of lower order. For example, although it is not well known, the English 
mathematician Charles Dodgson, who was the author of Alice's Adventures in Wonderland and Through 
the Looking Glass under the pen name of Lewis Carroll, invented such a method, called “condensation.” 
That method has recently been resurrected from obscurity because of its suitability for parallel 
processing on computers. 

[Image: Time & Life Pictures/Getty Images, Inc.] 


EXAMPLE 5 Smart Choice of Row or Column


If A is the 4 × 4 matrix


then to find det(A) it will be easiest to use cofactor expansion along the second column, since it has the most zeros:
\[ \det(A) = 1 \cdot \begin{vmatrix} 1 & 0 & -1 \\ 1 & -2 & 1 \\ 2 & 0 & 1 \end{vmatrix} \]


For the 3 × 3 determinant, it will be easiest to use cofactor expansion along its second column, since it has the most
zeros:


\[ \det(A) = 1 \cdot (-2) \cdot \begin{vmatrix} 1 & -1 \\ 2 & 1 \end{vmatrix} = -2(1 + 2) = -6 \]


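EXAMPLE 6 Determinant of a Lower Triangular Matrix


The following computation shows that the determinant of a 4 × 4 lower triangular matrix is the product of its
diagonal entries. Each part of the computation uses a cofactor expansion along the first row.


\[
\begin{vmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & 0 & 0 \\ a_{32} & a_{33} & 0 \\ a_{42} & a_{43} & a_{44} \end{vmatrix}
= a_{11}a_{22}\begin{vmatrix} a_{33} & 0 \\ a_{43} & a_{44} \end{vmatrix}
= a_{11}a_{22}a_{33}\,|a_{44}| = a_{11}a_{22}a_{33}a_{44}
\]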


The method illustrated in Example 6 can be easily adapted to prove the following general result. 


THEOREM 2.1.2 


If A is an n × n triangular matrix (upper triangular, lower triangular, or diagonal), then det(A) is the product of the
entries on the main diagonal of the matrix; that is, $\det(A) = a_{11}a_{22}\cdots a_{nn}$.
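
As a small illustration of Theorem 2.1.2, here is a hedged Python sketch that evaluates a triangular determinant "by inspection". The function name `det_triangular` is an assumption made for this example; the function refuses matrices that are not triangular, since the product of the diagonal entries is not the determinant in general.

```python
from math import prod


def det_triangular(matrix):
    """Product of the diagonal entries of an upper or lower triangular matrix."""
    n = len(matrix)
    # Upper triangular: every entry below the main diagonal is zero.
    upper = all(matrix[i][j] == 0 for i in range(n) for j in range(i))
    # Lower triangular: every entry above the main diagonal is zero.
    lower = all(matrix[i][j] == 0 for i in range(n) for j in range(i + 1, n))
    if not (upper or lower):
        raise ValueError("matrix is not triangular")
    return prod(matrix[i][i] for i in range(n))


# A 4 x 4 lower triangular matrix chosen for illustration.
L = [[-3, 0, 0, 0],
     [1, 2, 0, 0],
     [40, 10, -1, 0],
     [100, 200, -23, 3]]
print(det_triangular(L))  # (-3)(2)(-1)(3)
```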


A Useful Technique for Evaluating 2 x 2 and 3 x 3 Determinants 


Determinants of 2 x 2 and 3 x 3 matrices can be evaluated very efficiently using the pattern suggested in Figure 2.1.1. 


Figure 2.1.1 


In the 2 x 2 case, the determinant can be computed by forming the product of the entries on the rightward arrow and 
subtracting the product of the entries on the leftward arrow. In the 3 x 3 case we first recopy the first and second columns as 
shown in the figure, after which we can compute the determinant by summing the products of the entries on the rightward 
arrows and subtracting the products on the leftward arrows. These procedures execute the computations 


WARNING 


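The arrow technique only works for determinants of
2 × 2 and 3 × 3 matrices.


\[ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \]
\[
\begin{aligned}
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\
&= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\
&= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\end{aligned}
\]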


which agrees with the cofactor expansions along the first row. 


EXAMPLE 7 A Technique for Evaluating 2 × 2 and 3 × 3 Determinants


\[ \begin{vmatrix} 3 & 1 \\ 4 & -2 \end{vmatrix} = (3)(-2) - (1)(4) = -10 \]


\[ \begin{vmatrix} 1 & 2 & 3 \\ -4 & 5 & 6 \\ 7 & -8 & 9 \end{vmatrix} = [45 + 84 + 96] - [105 - 48 - 72] = 240 \]
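
The arrow technique translates directly into code. The sketch below is illustrative only; the function names are assumptions, and the 3 × 3 matrix is one whose rightward and leftward products are exactly the numbers 45, 84, 96 and 105, −48, −72 that appear in Example 7.

```python
def det2_arrow(m):
    """2 x 2 case: rightward product minus leftward product."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3_arrow(m):
    """3 x 3 case: recopy the first two columns, then follow the arrows."""
    ext = [row + row[:2] for row in m]  # each row now has 5 entries
    # Three diagonals running down and to the right.
    rightward = sum(ext[0][k] * ext[1][k + 1] * ext[2][k + 2] for k in range(3))
    # Three diagonals running down and to the left.
    leftward = sum(ext[0][k + 2] * ext[1][k + 1] * ext[2][k] for k in range(3))
    return rightward - leftward


print(det2_arrow([[3, 1], [4, -2]]))
print(det3_arrow([[1, 2, 3], [-4, 5, 6], [7, -8, 9]]))
```

As the warning above says, there is no analogous shortcut for 4 × 4 or larger matrices, so these two functions should not be generalized by adding more diagonals.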


Concept Review 
* Determinant 

* Minor 

* Cofactor 


* Cofactor expansion 


Skills 

* Find the minors and cofactors of a square matrix. 

* Use cofactor expansion to evaluate the determinant of a square matrix. 

* Use the arrow technique to evaluate the determinant of a 2 x 2 or 3 x 3 matrix. 

* Use the determinant of a 2 x 2 invertible matrix to find the inverse of that matrix. 


* Find the determinant of an upper triangular, lower triangular, or diagonal matrix by inspection. 


Exercise Set 2.1 


In Exercises 1—2, find all the minors and cofactors of the matrix A. 


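\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 6 & 7 & -1 \\ -3 & 1 & 4 \end{bmatrix} \]


Answer:

M_{11} = 29, C_{11} = 29
M_{12} = 21, C_{12} = −21
M_{13} = 27, C_{13} = 27
M_{21} = −11, C_{21} = 11
M_{22} = 13, C_{22} = 13
M_{23} = −5, C_{23} = 5
M_{31} = −19, C_{31} = −19
M_{32} = −19, C_{32} = 19
M_{33} = 19, C_{33} = 19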

3. Let

Find 

(a) M_{13} and C_{13}.
(b) M_{23} and C_{23}.
(c) M_{22} and C_{22}.
(d) M_{21} and C_{21}.


Answer: 


(a) M_{13} = 0, C_{13} = 0

(b) M_{23} = −96, C_{23} = 96
(c) M_{22} = −48, C_{22} = −48
(d) M_{21} = 72, C_{21} = −72


4. Let 


Find 

(a) M_{32} and C_{32}.
(b) M_{44} and C_{44}.
(c) M_{41} and C_{41}.
(d) M_{24} and C_{24}.


In Exercises 5-8, evaluate the determinant of the given matrix. If the matrix is invertible, use Equation 2 to find its inverse. 


eu 


Answer: 


11 22 
6.14 
8 
7.|=5 7 
=; —2 
Answer: 
Eu CES 
59 59 
59, 
|1262 295 
59 59 


круг ув 
Е 


In Exercises 9–14, use the arrow technique to evaluate the determinant of the given matrix.


9. \[ \begin{vmatrix} a - 3 & 5 \\ -3 & a - 2 \end{vmatrix} \]
Answer:
$a^2 - 5a + 21$
10.| -2 7 6 
5 1—2 
3.8 
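11. \[ \begin{vmatrix} -2 & 1 & 4 \\ 3 & 5 & -7 \\ 1 & 6 & 2 \end{vmatrix} \]
Answer:
−65
12. \[ \begin{vmatrix} -1 & 1 & 2 \\ 3 & 0 & -5 \\ 1 & 7 & 2 \end{vmatrix} \]
13. \[ \begin{vmatrix} 3 & 0 & 0 \\ 2 & -1 & 5 \\ 1 & 9 & -4 \end{vmatrix} \]
Answer:
−123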
14.|c -4 3 
2 UE 
4 c—-1 2 


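In Exercises 15–18, find all values of λ for which det(A) = 0.


15. \[ A = \begin{bmatrix} \lambda - 2 & 1 \\ -5 & \lambda + 4 \end{bmatrix} \]
Answer:
λ = 1 or λ = −3

16. \[ A = \begin{bmatrix} \lambda - 4 & 0 & 0 \\ 0 & \lambda & 2 \\ 0 & 3 & \lambda - 1 \end{bmatrix} \]
17. \[ A = \begin{bmatrix} \lambda - 1 & 0 \\ 2 & \lambda + 1 \end{bmatrix} \]
Answer:
λ = 1 or λ = −1
18. \[ A = \begin{bmatrix} \lambda - 4 & 4 & 0 \\ -1 & \lambda & 0 \\ 0 & 0 & \lambda - 5 \end{bmatrix} \]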

19. Evaluate the determinant of the matrix in Exercise 13 by a cofactor expansion along 
(a) the first row. 
(b) the first column. 
(c) the second row. 
(d) the second column. 
(e) the third row. 
(f) the third column. 


Answer: 


(all parts) — 123 
20. Evaluate the determinant of the matrix in Exercise 12 by a cofactor expansion along 
(a) the first row. 
(b) the first column. 
(c) the second row. 
(d) the second column. 
(e) the third row. 
(f) the third column. 


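In Exercises 21–26, evaluate det(A) by a cofactor expansion along a row or column of your choice.


21. \[ A = \begin{bmatrix} -3 & 0 & 7 \\ 2 & 5 & 1 \\ -1 & 0 & 5 \end{bmatrix} \]
Answer:
−40
22. \[ A = \begin{bmatrix} 3 & 3 & 1 \\ 1 & 0 & -4 \\ 1 & -3 & 5 \end{bmatrix} \]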


Answer: 


k+l k—1 7 


k+l К 


10 


2 


Answer: 


In Exercises 27—32, evaluate the determinant of the given matrix by inspection. 


Answer: 


Answer: 




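Answer:


6

32. \[ \begin{bmatrix} -3 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 \\ 40 & 10 & -1 & 0 \\ 100 & 200 & -23 & 3 \end{bmatrix} \]

33. Show that the value of the following determinant is independent of θ.
\[ \begin{vmatrix} \sin\theta & \cos\theta & 0 \\ -\cos\theta & \sin\theta & 0 \\ \sin\theta - \cos\theta & \sin\theta + \cos\theta & 1 \end{vmatrix} \]


Answer:


The determinant is $\sin^2\theta + \cos^2\theta = 1$.


34. Show that the matrices
\[ A = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} d & e \\ 0 & f \end{bmatrix} \]
commute if and only if
\[ \begin{vmatrix} b & a - c \\ e & d - f \end{vmatrix} = 0 \]

35. By inspection, what is the relationship between the following determinants?
\[ d_1 = \begin{vmatrix} a & b & c \\ d & 1 & f \\ g & 0 & 1 \end{vmatrix} \quad\text{and}\quad d_2 = \begin{vmatrix} a + \lambda & b & c \\ d & 1 & f \\ g & 0 & 1 \end{vmatrix} \]
Answer:
$d_2 = d_1 + \lambda$

36. Show that
\[ \det(A) = \frac{1}{2}\begin{vmatrix} \operatorname{tr}(A) & 1 \\ \operatorname{tr}(A^2) & \operatorname{tr}(A) \end{vmatrix} \]
for every 2 × 2 matrix A.

37. What can you say about an nth-order determinant all of whose entries are 1? Explain your reasoning.


38. What is the maximum number of zeros that a 3 × 3 matrix can have without having a zero determinant? Explain your
reasoning.


39. What is the maximum number of zeros that a 4 × 4 matrix can have without having a zero determinant? Explain your
reasoning.


40. Prove that $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$ are collinear points if and only if
\[ \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = 0 \]


41. Prove that the equation of the line through the distinct points $(a_1, b_1)$ and $(a_2, b_2)$ can be written as
\[ \begin{vmatrix} x & y & 1 \\ a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \end{vmatrix} = 0 \]


42. Prove that if A is upper triangular and $B_{ij}$ is the matrix that results when the ith row and jth column of A are deleted, then
$B_{ij}$ is upper triangular if i < j.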

True-False Exercises 


In parts (a)–(i) determine whether the statement is true or false, and justify your answer.


(a) The determinant of the 2 × 2 matrix $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is ad + bc.
Answer: 
False 


(b) Two square matrices А and В can have the same determinant only if they are the same size. 
Answer: 


False 


(c) The minor $M_{ij}$ is the same as the cofactor $C_{ij}$ if and only if i + j is even.
Answer: 


True 


(d) If A is a 3 × 3 symmetric matrix, then $C_{ij} = C_{ji}$ for all i and j.
Answer: 


True 


(e) The value of a cofactor expansion of a matrix А is independent of the row or column chosen for the expansion. 
Answer: 


True 


(f) The determinant of a lower triangular matrix is the sum of the entries along its main diagonal. 
Answer: 


False 


(g) For every square matrix A and every scalar c, we have det(c A) =c det(A). 
Answer: 


False 
(h) For all square matrices A and B, we have det(A + B) = det(A) + det(B).


Answer: 


False 


(i) For every 2 × 2 matrix A, we have $\det(A^2) = (\det(A))^2$.


Answer: 


True 
2.2 Evaluating Determinants by Row Reduction


In this section we will show how to evaluate a determinant by reducing the associated matrix to row echelon form. In 
general, this method requires less computation than cofactor expansion and hence is the method of choice for large 
matrices. 


A Basic Theorem 


We begin with a fundamental theorem that will lead us to an efficient procedure for evaluating the determinant of a square 
matrix of any size. 


THEOREM 2.2.1 


Let A be a square matrix. If A has a row of zeros or a column of zeros, then det(A) = 0.


Proof Since the determinant of A can be found by a cofactor expansion along any row or column, we can use the row or
column of zeros. Thus, if we let $C_1, C_2, \ldots, C_n$ denote the cofactors of A along that row or column, then it follows from
Formula 5 or 6 in Section 2.1 that


\[ \det(A) = 0 \cdot C_1 + 0 \cdot C_2 + \cdots + 0 \cdot C_n = 0 \]


The following useful theorem relates the determinant of a matrix and the determinant of its transpose. 


THEOREM 2.2.2 


Let A be a square matrix. Then $\det(A) = \det(A^T)$.


Because transposing a matrix changes its columns to 
rows and its rows to columns, almost every theorem 
about the rows of a determinant has a companion 
version about columns, and vice versa. 


Proof Since transposing a matrix changes its columns to rows and its rows to columns, the cofactor expansion of A
along any row is the same as the cofactor expansion of $A^T$ along the corresponding column. Thus, both have the same
determinant.


Elementary Row Operations 


The next theorem shows how an elementary row operation on a square matrix affects the value of its determinant. In 


place of a formal proof we have provided a table to illustrate the ideas in the 3 х 3 case (see Table 1). 


THEOREM 2.2.3 


Let A be an n × n matrix.


(a) If B is the matrix that results when a single row or single column of A is multiplied by a scalar k, then
det(B) = k det(A).


(b) If B is the matrix that results when two rows or two columns of A are interchanged, then det(B) = −det(A).


(c) If B is the matrix that results when a multiple of one row of A is added to another row or when a multiple of
one column is added to another column, then det(B) = det(A).


The first panel of Table 1 shows that you can bring a 
common factor from any row (column) of a 
determinant through the determinant sign. This is a 
slightly different way of thinking about part (a) of 
Theorem 2.2.3. 


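Table 1

Relationship:
\[ \begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = k\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \qquad \det(B) = k\det(A) \]
Operation: The first row of A is multiplied by k.


Relationship:
\[ \begin{vmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = -\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \qquad \det(B) = -\det(A) \]
Operation: The first and second rows of A are interchanged.


Relationship:
\[ \begin{vmatrix} a_{11} + ka_{21} & a_{12} + ka_{22} & a_{13} + ka_{23} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \qquad \det(B) = \det(A) \]
Operation: A multiple of the second row of A is added to the first row.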


We will verify the first equation in Table 1 and leave the other two for you. To start, note that the determinants on the two
sides of the equation differ only in the first row, so these determinants have the same cofactors, $C_{11}, C_{12}, C_{13}$, along that
row (since those cofactors depend only on the entries in the second two rows). Thus, expanding the left side by cofactors
along the first row yields


\[
\begin{aligned}
\begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
&= ka_{11}C_{11} + ka_{12}C_{12} + ka_{13}C_{13} \\
&= k(a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13}) \\
&= k\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
\end{aligned}
\]
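
The three parts of Theorem 2.2.3 can also be observed numerically. The following Python sketch is illustrative only: the matrix, the scalar k = 7, and the helper name `det` are assumptions made for the example, and `det` uses a plain first-row cofactor expansion.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


A = [[3, 1, 0], [-2, -4, 3], [5, 4, -2]]
k = 7

scaled = [[k * x for x in A[0]], A[1], A[2]]                 # first row times k
swapped = [A[1], A[0], A[2]]                                 # rows 1 and 2 interchanged
added = [[x + k * y for x, y in zip(A[0], A[1])], A[1], A[2]]  # row 1 + k * row 2

print(det(A), det(scaled), det(swapped), det(added))
```

The four printed values should be det(A), k det(A), −det(A), and det(A), in that order.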


Elementary Matrices 


It will be useful to consider the special case of Theorem 2.2.3 in which $A = I_n$ is the n × n identity matrix and E (rather
than B) denotes the elementary matrix that results when the row operation is performed on $I_n$. In this special case
Theorem 2.2.3 implies the following result.


THEOREM 2.2.4


Let E be an n × n elementary matrix.

(a) If E results from multiplying a row of $I_n$ by a nonzero number k, then det(E) = k.
(b) If E results from interchanging two rows of $I_n$, then det(E) = −1.

(c) If E results from adding a multiple of one row of $I_n$ to another, then det(E) = 1.


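EXAMPLE 1 Determinants of Elementary Matrices


The following determinants of elementary matrices, which are evaluated by inspection, illustrate Theorem
2.2.4.


Observe that the determinant of an elementary
matrix cannot be zero.


\[ \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = 3 \]
The second row of $I_4$ was multiplied by 3.

\[ \begin{vmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{vmatrix} = -1 \]
The first and last rows of $I_4$ were interchanged.

\[ \begin{vmatrix} 1 & 0 & 0 & 7 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = 1 \]
7 times the last row of $I_4$ was added to the first row.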


Matrices with Proportional Rows or Columns 


If a square matrix A has two proportional rows, then a row of zeros can be introduced by adding a suitable multiple of one 


of the rows to the other. Similarly for columns. But adding a multiple of one row or column to another does not change 
the determinant, so from Theorem 2.2.1, we must have det(A) = 0. This proves the following theorem. 


THEOREM 2.2.5 


If A is a square matrix with two proportional rows or two proportional columns, then det(A) = 0.


EXAMPLE 2 Introducing Zero Rows


The following computation shows how to introduce a row of zeros when there are two proportional rows.


\[
\begin{vmatrix} 1 & 3 & -2 & 4 \\ 2 & 6 & -4 & 8 \\ 3 & 9 & 1 & 5 \\ 1 & 1 & 4 & 8 \end{vmatrix}
= \begin{vmatrix} 1 & 3 & -2 & 4 \\ 0 & 0 & 0 & 0 \\ 3 & 9 & 1 & 5 \\ 1 & 1 & 4 & 8 \end{vmatrix} = 0
\]
The second row is 2 times the first, so we added −2 times the first row to the second to introduce a row of zeros.
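Each of the following matrices has two proportional rows or columns; thus, each has a determinant of zero.
\[
\begin{bmatrix} -1 & 4 \\ -2 & 8 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -2 & 7 \\ -4 & 8 & 5 \\ 2 & -4 & 3 \end{bmatrix}, \quad
\begin{bmatrix} 3 & -1 & 4 & -5 \\ 6 & -2 & 5 & 2 \\ 5 & 8 & 1 & 4 \\ -9 & 3 & -12 & 15 \end{bmatrix}
\]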
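
Theorem 2.2.5 is easy to test on concrete matrices. In the hedged sketch below, `proportional` and `det` are illustrative helper names (not from the text); proportionality is checked by cross-multiplication so that no division is needed.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def proportional(u, v):
    """True if one of the vectors u, v is a scalar multiple of the other."""
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i]
               for i in range(n) for j in range(i + 1, n))


# Second row is 2 times the first row.
M1 = [[1, 3, -2, 4], [2, 6, -4, 8], [3, 9, 1, 5], [1, 1, 4, 8]]
# Fourth row is -3 times the first row.
M2 = [[3, -1, 4, -5], [6, -2, 5, 2], [5, 8, 1, 4], [-9, 3, -12, 15]]

print(proportional(M1[0], M1[1]), det(M1))
print(proportional(M2[0], M2[3]), det(M2))
```

To test columns instead of rows, the same helper can be applied to the rows of the transpose, for example `list(zip(*M))`.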


Evaluating Determinants by Row Reduction 


We will now give a method for evaluating determinants that involves substantially less computation than cofactor 
expansion. The idea of the method is to reduce the given matrix to upper triangular form by elementary row operations, 
then compute the determinant of the upper triangular matrix (an easy computation), and then relate that determinant to 
that of the original matrix. Here is an example. 


EXAMPLE 3 Using Row Reduction to Evaluate a Determinant + 


Evaluate det(A) where

\[ A = \begin{bmatrix} 0 & 1 & 5 \\ 3 & -6 & 9 \\ 2 & 6 & 1 \end{bmatrix} \]


Solution We will reduce A to row echelon form (which is upper triangular) and then apply Theorem
2.1.2.


Even with today's fastest computers it would 
take millions of years to calculate a 25 x 25 
determinant by cofactor expansion, so 


methods based on row reduction are often 
used for large determinants. For determinants 
of small size (such as those in this text), 
cofactor expansion is often a reasonable 


choice. 
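\[
\begin{aligned}
\det(A) = \begin{vmatrix} 0 & 1 & 5 \\ 3 & -6 & 9 \\ 2 & 6 & 1 \end{vmatrix}
&= -\begin{vmatrix} 3 & -6 & 9 \\ 0 & 1 & 5 \\ 2 & 6 & 1 \end{vmatrix}
&& \text{The first and second rows of } A \text{ were interchanged.} \\
&= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 2 & 6 & 1 \end{vmatrix}
&& \text{A common factor of 3 from the first row was taken through the determinant sign.} \\
&= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 10 & -5 \end{vmatrix}
&& -2 \text{ times the first row was added to the third row.} \\
&= -3\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & -55 \end{vmatrix}
&& -10 \text{ times the second row was added to the third row.} \\
&= (-3)(-55)\begin{vmatrix} 1 & -2 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & 1 \end{vmatrix}
&& \text{A common factor of } -55 \text{ from the last row was taken through the determinant sign.} \\
&= (-3)(-55)(1) = 165
\end{aligned}
\]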
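
The procedure of Example 3 can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the text's own algorithm: the name `det_row_reduction` is hypothetical, exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding, and instead of factoring each pivot out of its row the sketch simply multiplies the running result by the pivot, which has the same effect on the final value.

```python
from fractions import Fraction


def det_row_reduction(matrix):
    """Determinant via reduction to upper triangular form.

    A row interchange flips the sign (Theorem 2.2.3(b)); adding a multiple
    of one row to another changes nothing (Theorem 2.2.3(c)); the result is
    the signed product of the pivots (Theorem 2.1.2).
    """
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available, so the determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        # Clear the entries below the pivot.
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result


# Rows 0 1 5 / 3 -6 9 / 2 6 1, the determinant evaluated in Example 3.
print(det_row_reduction([[0, 1, 5], [3, -6, 9], [2, 6, 1]]))
```

For an n × n matrix this takes on the order of n³ arithmetic operations, compared with roughly n! products for a full cofactor expansion, which is the point of the margin note above.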


EXAMPLE 4 Using Column Operations to Evaluate a Determinant — 


Compute the determinant of 


Solution This determinant could be computed as above by using elementary row operations to reduce A to 
row echelon form, but we can put A in lower triangular form in one step by adding —3 times the first column 
to the fourth to obtain 


\[
\det(A) = \det\begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 7 & 0 & 0 \\ 0 & 6 & 3 & 0 \\ 7 & 3 & 1 & -26 \end{bmatrix} = (1)(7)(3)(-26) = -546
\]


Example 4 points out that it is always wise to keep 
an eye open for column operations that can shorten 


computations. 


Cofactor expansion and row or column operations can sometimes be used in combination to provide an effective method 
for evaluating determinants. The following example illustrates this idea. 


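EXAMPLE 5 Row Operations and Cofactor Expansion


Evaluate det(A) where

\[ A = \begin{bmatrix} 3 & 5 & -2 & 6 \\ 1 & 2 & -1 & 1 \\ 2 & 4 & 1 & 5 \\ 3 & 7 & 5 & 3 \end{bmatrix} \]


Solution By adding suitable multiples of the second row to the remaining rows, we obtain
\[
\begin{aligned}
\det(A) &= \begin{vmatrix} 0 & -1 & 1 & 3 \\ 1 & 2 & -1 & 1 \\ 0 & 0 & 3 & 3 \\ 0 & 1 & 8 & 0 \end{vmatrix} \\
&= -\begin{vmatrix} -1 & 1 & 3 \\ 0 & 3 & 3 \\ 1 & 8 & 0 \end{vmatrix}
&& \text{Cofactor expansion along the first column.} \\
&= -\begin{vmatrix} -1 & 1 & 3 \\ 0 & 3 & 3 \\ 0 & 9 & 3 \end{vmatrix}
&& \text{We added the first row to the third row.} \\
&= -(-1)\begin{vmatrix} 3 & 3 \\ 9 & 3 \end{vmatrix}
&& \text{Cofactor expansion along the first column.} \\
&= -18
\end{aligned}
\]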


Skills 

* Know the effect of elementary row operations on the value of a determinant. 

* Know the determinants of the three types of elementary matrices. 

* Know how to introduce zeros into the rows or columns of a matrix to facilitate the evaluation of its determinant. 
* Use row reduction to evaluate the determinant of a matrix. 

* Use column operations to evaluate the determinant of a matrix. 


* Combine the use of row reduction and cofactor expansion to evaluate the determinant of a matrix. 


Exercise Set 2.2 


In Exercises 1–4, verify that $\det(A) = \det(A^T)$.


1 =? 3 
‘A= 


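5. \[ \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -5 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} \]
Answer:
−5

6. \[ \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -5 & 0 & 1 \end{vmatrix} \]

7. \[ \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} \]
Answer:
−1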

8.|1 00 0 

US 
0 3 0 0 
0 0 10 
0 00 1 

9. \[ \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -9 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} \]
Answer:


1


In Exercises 10—17, evaluate the determinant of the given matrix by reducing the matrix to row echelon form. 


10. \[ \begin{bmatrix} 3 & 6 & -9 \\ 0 & 0 & -2 \\ -2 & 1 & 5 \end{bmatrix} \]

11. 


0 
1 
E 


Answer: 


5 
12 1 -3 
-2 4 1 
5 -22 
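13. \[ \begin{bmatrix} 3 & -6 & 9 \\ -2 & 7 & -2 \\ 0 & 1 & 5 \end{bmatrix} \]
Answer:
33
14. \[ \begin{bmatrix} 1 & -2 & 3 & 1 \\ 5 & -9 & 6 & 3 \\ -1 & 2 & -6 & -2 \\ 2 & 8 & 6 & 1 \end{bmatrix} \]
15. \[ \begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 2 & 1 & 0 \\ 0 & 1 & 2 & 3 \end{bmatrix} \]
Answer:
6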
16. Oc ШЫ a4 
1 ug 
2 2 1 2 
2 1 1 
3 °з 3 А 
12 
-3 3 0 0 
17. 1 3 1 305 
—2 =; 0-4 2 
0 0 1 0 1 
0 0 2 [SE 
0 0 0 1 1 
Answer: 
—2 


18. Repeat Exercises 10—13 by using a combination of row reduction and cofactor expansion. 


19. Repeat Exercises 14—17 by using a combination of row operations and cofactor expansion. 
Answer: 
Exercise 14: 39; Exercise 15: 6; Exercise 16: -4; Exercise 17: —2 


In Exercises 20—27, evaluate the determinant, given that 


20. 


21. 


22. 


23. 


24. 


25. 


26. 


27. 


28. 


б- © x 
V 


72 


g h i 
a+g b+h c+i 
d e F 


g h i 
Answer: 
—6 

2d 2e 2f 
Е+ За h+3b i+ Зс 


—3a —3b —3c 
d e Jd 
g—4d h—4e i—4f 


Answer: 


18 
Show that 


det) 0 аз azn |= — 413422431 


œ [0 0 O0 ayy 
del 0 © 223 424 


= 21402343224] 
232 933 934 
G4] 44 G43 а 
29. Use row reduction to show that
\[ \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (b - a)(c - a)(c - b) \]


In Exercises 30—33, confirm the identities without evaluating the determinants directly. 


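30. \[ \begin{vmatrix} a_1 + b_1t & a_2 + b_2t & a_3 + b_3t \\ a_1t + b_1 & a_2t + b_2 & a_3t + b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = (1 - t^2)\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \]


31. \[ \begin{vmatrix} a_1 & b_1 & a_1 + b_1 + c_1 \\ a_2 & b_2 & a_2 + b_2 + c_2 \\ a_3 & b_3 & a_3 + b_3 + c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \]

32. \[ \begin{vmatrix} a_1 & b_1 + ta_1 & c_1 + rb_1 + sa_1 \\ a_2 & b_2 + ta_2 & c_2 + rb_2 + sa_2 \\ a_3 & b_3 + ta_3 & c_3 + rb_3 + sa_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \]


33. \[ \begin{vmatrix} a_1 + b_1 & a_1 - b_1 & c_1 \\ a_2 + b_2 & a_2 - b_2 & c_2 \\ a_3 + b_3 & a_3 - b_3 & c_3 \end{vmatrix} = -2\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \]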


34. Find the determinant of the following matrix. 


a bbb 
ba bb 
bbab 
bbba 


In Exercises 35—36, show that det(A) = 0 without directly evaluating the determinant. 


35. —2 8 P 4 
3 25 4 
a= 1 106 5 
4 —6 4 —3 
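36. \[ A = \begin{bmatrix} -4 & 1 & 1 & 1 & 1 \\ 1 & -4 & 1 & 1 & 1 \\ 1 & 1 & -4 & 1 & 1 \\ 1 & 1 & 1 & -4 & 1 \\ 1 & 1 & 1 & 1 & -4 \end{bmatrix} \]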


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) If A is a 4 × 4 matrix and B is obtained from A by interchanging the first two rows and then interchanging the last two
rows, then det(B) = det(A).


Answer: 


True 


(b) If A is a 3 × 3 matrix and B is obtained from A by multiplying the first column by 4 and multiplying the third column
by 3/4, then det(B) = 3 det(A).


Answer: 


True 


(c) If A is a 3 × 3 matrix and B is obtained from A by adding 5 times the first row to each of the second and third rows,
then det(B) = 25 det(A).


Answer: 


False 


(d) If A is an n × n matrix and B is obtained from A by multiplying each row of A by its row number, then


\[ \det(B) = \frac{n(n+1)}{2}\det(A) \]


Answer: 


False 


(e) If A is a square matrix with two identical columns, then det(A) = 0.
Answer: 


True 


(f) If the sum of the second and fourth row vectors of a 6 × 6 matrix A is equal to the last row vector, then det(A) = 0.
Answer: 


True 
2.3 Properties of Determinants; Cramer's Rule


In this section we will develop some fundamental properties of matrices, and we will use these results to derive a 
formula for the inverse of an invertible matrix and formulas for the solutions of certain kinds of linear systems. 


Basic Properties of Determinants 


Suppose that A and B are n × n matrices and k is any scalar. We begin by considering possible relationships
between det(A), det(B), and

det(kA), det(A + B), and det(AB)
Since a common factor of any row of a matrix can be moved through the determinant sign, and since each of the
n rows in kA has a common factor of k, it follows that


\[ \det(kA) = k^n \det(A) \tag{1} \]
For example,
\[ \begin{vmatrix} ka_{11} & ka_{12} & ka_{13} \\ ka_{21} & ka_{22} & ka_{23} \\ ka_{31} & ka_{32} & ka_{33} \end{vmatrix} = k^3\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \]


Unfortunately, no simple relationship exists among det(A), det(B), and det(A + B). In particular, we emphasize
that det(A + B) will usually not be equal to det(A) + det(B). The following example illustrates this fact.


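EXAMPLE 1 det(A + B) ≠ det(A) + det(B)


Consider
\[ A = \begin{bmatrix} 1 & 2 \\ 2 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}, \quad A + B = \begin{bmatrix} 4 & 3 \\ 3 & 8 \end{bmatrix} \]
We have det(A) = 1, det(B) = 8, and det(A + B) = 23; thus
det(A + B) ≠ det(A) + det(B)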
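
Formula 1 and the failure of additivity can both be checked on small matrices. The sketch below is illustrative: the helper names, the scalar k = 3, and the two 2 × 2 matrices (chosen so that their determinants are 1, 8, and 23 for the sum, as in Example 1) are assumptions made for this example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def scale(k, m):
    """The scalar multiple kA."""
    return [[k * x for x in row] for row in m]


def add(a, b):
    """The entrywise sum A + B."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[1, 2], [2, 5]]
B = [[3, 1], [1, 3]]
k, n = 3, 2

print(det(scale(k, A)), k ** n * det(A))   # the two sides of Formula 1
print(det(add(A, B)), det(A) + det(B))     # these two numbers differ
```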


In spite of the previous example, there is a useful relationship concerning sums of determinants that is applicable 
when the matrices involved are the same except for one row (column). For example, consider the following two 
matrices that differ only in the second row: 


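\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} a_{11} & a_{12} \\ b_{21} & b_{22} \end{bmatrix} \]


Calculating the determinants of A and B we obtain


\[
\begin{aligned}
\det(A) + \det(B) &= (a_{11}a_{22} - a_{12}a_{21}) + (a_{11}b_{22} - a_{12}b_{21}) \\
&= a_{11}(a_{22} + b_{22}) - a_{12}(a_{21} + b_{21}) \\
&= \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}
\end{aligned}
\]


Thus
\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \det\begin{bmatrix} a_{11} & a_{12} \\ b_{21} & b_{22} \end{bmatrix} = \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} \]


This is a special case of the following general result.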


THEOREM 2.3.1 


Let A, B, and C be n × n matrices that differ only in a single row, say the rth, and assume that the rth row
of C can be obtained by adding corresponding entries in the rth rows of A and B. Then


\[ \det(C) = \det(A) + \det(B) \]


The same result holds for columns. 


EXAMPLE 2 Sums of Determinants + 


We leave it to you to confirm the following equality by evaluating the determinants. 


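\[
\det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 1 + 0 & 4 + 1 & 7 + (-1) \end{bmatrix}
= \det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 1 & 4 & 7 \end{bmatrix}
+ \det\begin{bmatrix} 1 & 7 & 5 \\ 2 & 0 & 3 \\ 0 & 1 & -1 \end{bmatrix}
\]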
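
The equality in Example 2 can be confirmed with a few lines of Python. This is a minimal sketch; the helper name `det` and the variable names are illustrative and not part of the text.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


top = [[1, 7, 5], [2, 0, 3]]          # the two rows shared by all three matrices
A = top + [[1, 4, 7]]
B = top + [[0, 1, -1]]
C = top + [[a + b for a, b in zip(A[2], B[2])]]  # third row is the sum of third rows

print(det(A), det(B), det(C))
```

The third printed value should equal the sum of the first two, as Theorem 2.3.1 guarantees for matrices that differ in a single row.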


Determinant of a Matrix Product 


Considering the complexity of the formulas for determinants and matrix multiplication, it would seem unlikely 
that a simple relationship should exist between them. This is what makes the simplicity of our next result so 
surprising. We will show that if A and В are square matrices of the same size, then 


\[ \det(AB) = \det(A)\det(B) \tag{2} \]


The proof of this theorem is fairly intricate, so we will have to develop some preliminary results first. We begin 
with the special case of 2 in which А is an elementary matrix. Because this special case is only a prelude to 2, we 
call it a lemma. 


LEMMA 2.3.2 


If B is an n × n matrix and E is an n × n elementary matrix, then


\[ \det(EB) = \det(E)\det(B) \]


Proof We will consider three cases, each in accordance with the row operation that produces the matrix E.


Case 1 If E results from multiplying a row of $I_n$ by k, then by Theorem 1.5.1, EB results from B by multiplying
the corresponding row by k; so from Theorem 2.2.3(a) we have
\[ \det(EB) = k\det(B) \]
But from Theorem 2.2.4(a) we have det(E) = k, so
\[ \det(EB) = \det(E)\det(B) \]


Cases 2 and 3 The proofs of the cases where E results from interchanging two rows of $I_n$ or from adding a
multiple of one row to another follow the same pattern as Case 1 and are left as exercises.


Remark It follows by repeated applications of Lemma 2.3.2 that if B is an n × n matrix and $E_1, E_2, \ldots, E_r$ are
n × n elementary matrices, then


\[ \det(E_1E_2\cdots E_rB) = \det(E_1)\det(E_2)\cdots\det(E_r)\det(B) \tag{3} \]


Determinant Test for Invertibility 


Our next theorem provides an important criterion for determining whether a matrix is invertible. It also takes us a 
step closer to establishing Formula 2. 


THEOREM 2.3.3 


A square matrix A is invertible if and only if det(A) ≠ 0.


Proof Let R be the reduced row echelon form of A. As a preliminary step, we will show that det(A) and det(R)
are both zero or both nonzero: Let $E_1, E_2, \ldots, E_r$ be the elementary matrices that correspond to the elementary
row operations that produce R from A. Thus


\[ R = E_r\cdots E_2E_1A \]


and from 3,


\[ \det(R) = \det(E_r)\cdots\det(E_2)\det(E_1)\det(A) \tag{4} \]


We pointed out in the margin note that accompanies Theorem 2.2.4 that the determinant of an elementary matrix
is nonzero. Thus, it follows from Formula 4 that det(A) and det(R) are either both zero or both nonzero, which
sets the stage for the main part of the proof. If we assume first that A is invertible, then it follows from Theorem
1.6.4 that R = I and hence that det(R) = 1 (≠ 0). This, in turn, implies that det(A) ≠ 0, which is what we
wanted to show.


Conversely, assume that det(A) ≠ 0. It follows from this that det(R) ≠ 0, which tells us that R cannot have a row
of zeros. Thus, it follows from Theorem 1.4.3 that R = I and hence that A is invertible by Theorem 1.6.4.


It follows from Theorems 2.3.3 and 2.2.5
that a square matrix with two proportional
rows or two proportional columns is not
invertible.


EXAMPLE 3 Determinant Test for Invertibility + 


Since the first and third rows of 


are proportional, det(A) = 0. Thus А is not invertible. 


We are now ready for the main result concerning products of matrices. 


THEOREM 2.3.4 


If A and B are square matrices of the same size, then 


\[ \det(AB) = \det(A)\det(B) \]


Proof We divide the proof into two cases that depend on whether or not A is invertible. If the matrix A is not
invertible, then by Theorem 1.6.5 neither is the product AB. Thus, from Theorem 2.3.3, we have
det(AB) = 0 and det(A) = 0, so it follows that det(AB) = det(A) det(B).


Augustin Louis Cauchy (1789—1857) 


Historical Note In 1815 the great French mathematician Augustin Cauchy published a landmark paper 
in which he gave the first systematic and modern treatment of determinants. It was in that paper that 
Theorem 2.3.4 was stated and proved in full generality for the first time. Special cases of the theorem had 
been stated and proved earlier, but it was Cauchy who made the final jump. 

[Image: The Granger Collection, New York]


Now assume that A is invertible. By Theorem 1.6.4, the matrix A is expressible as a product of elementary 
matrices, say 


\[ A = E_1E_2\cdots E_r \tag{5} \]


so
\[ AB = E_1E_2\cdots E_rB \]
Applying 3 to this equation yields
\[ \det(AB) = \det(E_1)\det(E_2)\cdots\det(E_r)\det(B) \]
and applying 3 again yields
\[ \det(AB) = \det(E_1E_2\cdots E_r)\det(B) \]
which, from 5, can be written as det(AB) = det(A) det(B).


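EXAMPLE 4 Verifying That det(AB) = det(A) det(B)


Consider the matrices
\[ A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} -1 & 3 \\ 5 & 8 \end{bmatrix}, \quad AB = \begin{bmatrix} 2 & 17 \\ 3 & 14 \end{bmatrix} \]
We leave it for you to verify that


det(A) = 1, det(B) = −23, and det(AB) = −23
Thus det(AB) = det(A) det(B), as guaranteed by Theorem 2.3.4.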
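
The verification left to the reader in Example 4 can be sketched in Python as follows. The helper names `det` and `matmul` are illustrative, and the two 2 × 2 matrices are chosen so that their determinants are 1 and −23, the values quoted in the example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def matmul(a, b):
    """Matrix product AB for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[3, 1], [2, 1]]
B = [[-1, 3], [5, 8]]
AB = matmul(A, B)

print(AB)
print(det(A), det(B), det(AB))
```

A numerical check of this kind illustrates Theorem 2.3.4 for one pair of matrices; it is not a substitute for the proof.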


The following theorem gives a useful relationship between the determinant of an invertible matrix and the 
determinant of its inverse. 




THEOREM 2.3.5 


If A is invertible, then 


\[ \det(A^{-1}) = \frac{1}{\det(A)} \]


Proof Since $A^{-1}A = I$, it follows that $\det(A^{-1}A) = \det(I)$. Therefore, we must have $\det(A^{-1})\det(A) = 1$.
Since det(A) ≠ 0, the proof can be completed by dividing through by det(A).


Adjoint of a Matrix 


In a cofactor expansion we compute det(A) by multiplying the entries in a row or column by their cofactors and 
adding the resulting products. It turns out that if one multiplies the entries in any row by the corresponding 
cofactors from a different row, the sum of these products is always zero. (This result also holds for columns.) 
Although we omit the general proof, the next example illustrates the idea of the proof in a special case. 


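It follows from Theorems 2.3.5 and 2.1.2 that, for an invertible triangular matrix A,
\[ \det(A^{-1}) = \frac{1}{a_{11}} \cdot \frac{1}{a_{22}} \cdots \frac{1}{a_{nn}} \]
Moreover, by using the adjoint formula it is
possible to show that
\[ \frac{1}{a_{11}}, \frac{1}{a_{22}}, \ldots, \frac{1}{a_{nn}} \]
are actually the successive diagonal entries of
$A^{-1}$ (compare A and $A^{-1}$ in Example 3 of
Section 1.7).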


EXAMPLE 5 Entries and Cofactors from Different Rows


Let
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \]
Consider the quantity
\[ a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} \]
that is formed by multiplying the entries in the first row by the cofactors of the corresponding entries
in the third row and adding the resulting products. We can show that this quantity is equal to zero by
the following trick: Construct a new matrix A′ by replacing the third row of A with another copy of the
first row. That is,


\[ A' = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \end{bmatrix} \]
Let $C'_{31}, C'_{32}, C'_{33}$ be the cofactors of the entries in the third row of A′. Since the first two rows of A
and A′ are the same, and since the computations of $C_{31}, C_{32}, C_{33}, C'_{31}, C'_{32}$, and $C'_{33}$ involve only
entries from the first two rows of A and A′, it follows that


\[ C_{31} = C'_{31}, \quad C_{32} = C'_{32}, \quad C_{33} = C'_{33} \]


Since A′ has two identical rows, it follows from Theorem 2.2.5 that
\[ \det(A') = 0 \tag{6} \]
On the other hand, evaluating det(A′) by cofactor expansion along the third row gives
\[ \det(A') = a_{11}C'_{31} + a_{12}C'_{32} + a_{13}C'_{33} = a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} \tag{7} \]


From 6 and 7 we obtain
\[ a_{11}C_{31} + a_{12}C_{32} + a_{13}C_{33} = 0 \]
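
A numerical check of this fact may help. The Python sketch below is illustrative only: the 3 × 3 matrix and the helper names `det` and `cofactor` are assumptions chosen for the example. It forms the sum of the entries of row i times the cofactors of row k for every pair (i, k).

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cofactor(m, i, j):
    """Cofactor C_ij (0-based): signed determinant of the minor."""
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)


A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
n = len(A)

# Entry (i, k) is the sum over j of a_ij * C_kj.
table = [[sum(A[i][j] * cofactor(A, k, j) for j in range(n)) for k in range(n)]
         for i in range(n)]
print(table)
```

The printed table should have det(A) in every diagonal position (same row for entries and cofactors) and zero everywhere else (different rows).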


DEFINITION 1 


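If A is any n × n matrix and $C_{ij}$ is the cofactor of $a_{ij}$, then the matrix


\[ \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix} \]


is called the matrix of cofactors from A. The transpose of this matrix is called the adjoint of A and is
denoted by adj(A).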


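EXAMPLE 6 Adjoint of a 3 × 3 Matrix


Let
\[ A = \begin{bmatrix} 3 & 2 & -1 \\ 1 & 6 & 3 \\ 2 & -4 & 0 \end{bmatrix} \]


The cofactors of A are
C_{11} = 12    C_{12} = 6     C_{13} = −16
C_{21} = 4     C_{22} = 2     C_{23} = 16
C_{31} = 12    C_{32} = −10   C_{33} = 16


so the matrix of cofactors is


\[ \begin{bmatrix} 12 & 6 & -16 \\ 4 & 2 & 16 \\ 12 & -10 & 16 \end{bmatrix} \]
and the adjoint of A is
\[ \operatorname{adj}(A) = \begin{bmatrix} 12 & 4 & 12 \\ 6 & 2 & -10 \\ -16 & 16 & 16 \end{bmatrix} \]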


Leonard Eugene Dickson (1874—1954) 


Historical Note The use of the term adjoint for the transpose of the matrix of cofactors appears to have 


been introduced by the American mathematician L. E. Dickson in a research paper that he published in 
1902. 


[Image: Courtesy of the American Mathematical Society]


In Theorem 1.4.5 we gave a formula for the inverse of a 2 × 2 invertible matrix. Our next theorem extends that
result to n × n invertible matrices.


THEOREM 2.3.6 Inverse of a Matrix Using Its Adjoint


If A is an invertible matrix, then


\[ A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A) \tag{8} \]


Proof We show first that 


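\[ A\operatorname{adj}(A) = \det(A)I \]


Consider the product


\[
A\operatorname{adj}(A) =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
C_{11} & C_{21} & \cdots & C_{j1} & \cdots & C_{n1} \\
C_{12} & C_{22} & \cdots & C_{j2} & \cdots & C_{n2} \\
\vdots & \vdots & & \vdots & & \vdots \\
C_{1n} & C_{2n} & \cdots & C_{jn} & \cdots & C_{nn}
\end{bmatrix}
\]


The entry in the ith row and jth column of the product A adj(A) is
\[ a_{i1}C_{j1} + a_{i2}C_{j2} + \cdots + a_{in}C_{jn} \tag{9} \]
(the ith row of A times the jth column of adj(A)).


If i = j, then 9 is the cofactor expansion of det(A) along the ith row of A (Theorem 2.1.1), and if i ≠ j, then the
a's and the cofactors come from different rows of A, so the value of 9 is zero. Therefore,


\[
A\operatorname{adj}(A) =
\begin{bmatrix}
\det(A) & 0 & \cdots & 0 \\
0 & \det(A) & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \det(A)
\end{bmatrix}
= \det(A)I \tag{10}
\]


Since A is invertible, det(A) ≠ 0. Therefore, Equation 10 can be rewritten as


\[ \frac{1}{\det(A)}[A\operatorname{adj}(A)] = I \quad\text{or}\quad A\left[\frac{1}{\det(A)}\operatorname{adj}(A)\right] = I \]
Multiplying both sides on the left by $A^{-1}$ yields


\[ A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A) \]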

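EXAMPLE 7 Using the Adjoint to Find an Inverse Matrix
Use 8 to find the inverse of the matrix A in Example 6.


Solution We leave it for you to check that det(A) = 64. Thus


\[
A^{-1} = \frac{1}{\det(A)}\operatorname{adj}(A)
= \frac{1}{64}\begin{bmatrix} 12 & 4 & 12 \\ 6 & 2 & -10 \\ -16 & 16 & 16 \end{bmatrix}
= \begin{bmatrix} \frac{12}{64} & \frac{4}{64} & \frac{12}{64} \\ \frac{6}{64} & \frac{2}{64} & -\frac{10}{64} \\ -\frac{16}{64} & \frac{16}{64} & \frac{16}{64} \end{bmatrix}
\]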
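
Formula 8 can be turned into a short program. The following Python sketch is one possible implementation under stated assumptions, not a recommended numerical method: the names `cofactor_matrix`, `adjoint`, and `inverse_by_adjoint` are illustrative, the entries are assumed to be integers, and exact fractions are used for the result.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cofactor_matrix(m):
    """Matrix whose (i, j) entry is the cofactor C_ij."""
    n = len(m)
    return [[(-1) ** (i + j)
             * det([row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i])
             for j in range(n)]
            for i in range(n)]


def adjoint(m):
    """adj(A): the transpose of the matrix of cofactors."""
    return [list(col) for col in zip(*cofactor_matrix(m))]


def inverse_by_adjoint(m):
    """A^{-1} = adj(A) / det(A) for an integer matrix; fails if det(A) = 0."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(x, d) for x in row] for row in adjoint(m)]


A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
print(adjoint(A))
print(inverse_by_adjoint(A))
```

Multiplying A by the returned matrix should give the identity, which is a convenient way to check the output. For large matrices, row reduction is usually far cheaper than computing n² cofactors.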


Cramer's Rule 


Our next theorem uses the formula for the inverse of an invertible matrix to produce a formula, called Cramer's
rule, for the solution of a linear system Ax = b of n equations in n unknowns in the case where the coefficient
matrix A is invertible (or, equivalently, when det(A) ≠ 0).


THEOREM 2.3.7 Cramer's Rule


If Ax = b is a system of n linear equations in n unknowns such that det(A) ≠ 0, then the system has a
unique solution. This solution is


\[ x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)} \]
where $A_j$ is the matrix obtained by replacing the entries in the jth column of A by the entries in the matrix
\[ \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \]
by 


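Proof If det(A) ≠ 0, then A is invertible, and by Theorem 1.6.2, $\mathbf{x} = A^{-1}\mathbf{b}$ is the unique solution of Ax = b.
Therefore, by Theorem 2.3.6 we have

\[
\mathbf{x} = A^{-1}\mathbf{b} = \frac{1}{\det(A)}\,\operatorname{adj}(A)\,\mathbf{b}
= \frac{1}{\det(A)}
\begin{bmatrix}
C_{11} & C_{21} & \cdots & C_{n1} \\
C_{12} & C_{22} & \cdots & C_{n2} \\
\vdots & \vdots & & \vdots \\
C_{1n} & C_{2n} & \cdots & C_{nn}
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}
\]

Multiplying the matrices out gives

\[
\mathbf{x} = \frac{1}{\det(A)}
\begin{bmatrix}
b_1C_{11} + b_2C_{21} + \cdots + b_nC_{n1} \\
b_1C_{12} + b_2C_{22} + \cdots + b_nC_{n2} \\
\vdots \\
b_1C_{1n} + b_2C_{2n} + \cdots + b_nC_{nn}
\end{bmatrix}
\]

The entry in the jth row of x is therefore

\[ x_j = \frac{b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj}}{\det(A)} \tag{11} \]

Now let

\[
A_j =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1\,j-1} & b_1 & a_{1\,j+1} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2\,j-1} & b_2 & a_{2\,j+1} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{n\,j-1} & b_n & a_{n\,j+1} & \cdots & a_{nn}
\end{bmatrix}
\]

Since $A_j$ differs from A only in the jth column, it follows that the cofactors of entries $b_1, b_2, \ldots, b_n$ in $A_j$ are the
same as the cofactors of the corresponding entries in the jth column of A. The cofactor expansion of $\det(A_j)$
along the jth column is therefore

\[ \det(A_j) = b_1C_{1j} + b_2C_{2j} + \cdots + b_nC_{nj} \]

Substituting this result in (11) gives

\[ x_j = \frac{\det(A_j)}{\det(A)} \]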


EXAMPLE 8 Using Cramer's Rule to Solve a Linear System


Use Cramer's rule to solve

\[
\begin{aligned}
x_1 + 2x_3 &= 6 \\
-3x_1 + 4x_2 + 6x_3 &= 30 \\
-x_1 - 2x_2 + 3x_3 &= 8
\end{aligned}
\]


Gabriel Cramer (1704—1752) 


Historical Note Variations of Cramer's rule were fairly well known before the Swiss 
mathematician discussed it in work he published in 1750. It was Cramer's superior notation 
that popularized the method and led mathematicians to attach his name to it. 

[Image: Granger Collection]


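Solution

\[
A = \begin{bmatrix} 1 & 0 & 2 \\ -3 & 4 & 6 \\ -1 & -2 & 3 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 6 & 0 & 2 \\ 30 & 4 & 6 \\ 8 & -2 & 3 \end{bmatrix},
\]
\[
A_2 = \begin{bmatrix} 1 & 6 & 2 \\ -3 & 30 & 6 \\ -1 & 8 & 3 \end{bmatrix}, \quad
A_3 = \begin{bmatrix} 1 & 0 & 6 \\ -3 & 4 & 30 \\ -1 & -2 & 8 \end{bmatrix}
\]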


For n > 3, it is usually more efficient to
solve a linear system with n equations in n
unknowns by Gauss-Jordan elimination
than by Cramer's rule. The main use of
Cramer's rule is for obtaining properties of
solutions of a linear system without
actually solving the system.


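Therefore,

\[
x_1 = \frac{\det(A_1)}{\det(A)} = \frac{-40}{44} = -\frac{10}{11}, \quad
x_2 = \frac{\det(A_2)}{\det(A)} = \frac{72}{44} = \frac{18}{11}, \quad
x_3 = \frac{\det(A_3)}{\det(A)} = \frac{152}{44} = \frac{38}{11}
\]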
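The computation in Example 8 can be reproduced with a short script. This is a sketch under stated assumptions: it handles only 3 × 3 systems, uses exact fractions, and the function names `det3` and `cramer3` are illustrative. The coefficient matrix and right-hand side are those of the system in Example 8 (rows 1, 0, 2; −3, 4, 6; −1, −2, 3 with constants 6, 30, 8).

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(A, b):
    """Solve Ax = b by Cramer's rule; requires det(A) != 0."""
    d = det3(A)
    if d == 0:
        raise ValueError("Cramer's rule does not apply: det(A) = 0")
    solution = []
    for j in range(3):
        # A_j is A with its jth column replaced by the entries of b.
        A_j = [row[:j] + [b[r]] + row[j + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det3(A_j), d))
    return solution


A = [[1, 0, 2], [-3, 4, 6], [-1, -2, 3]]
b = [6, 30, 8]
x = cramer3(A, b)
print([str(value) for value in x])  # ['-10/11', '18/11', '38/11']
```

Each unknown costs one extra determinant, which is why elimination is usually preferred for larger systems.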


Equivalence Theorem 


In Theorem 1.6.4 we listed five results that are equivalent to the invertibility of a matrix A. We conclude this
section by merging Theorem 2.3.3 with that list to produce the following theorem that relates all of the major 
topics we have studied thus far. 


THEOREM 2.3.8 Equivalent Statements 


If A is an n × n matrix, then the following statements are equivalent.
(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row echelon form of A is $I_n$.

(d) A can be expressed as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.

(g) det(A) ≠ 0.


OPTIONAL 


We now have all of the machinery necessary to prove the following two results, which we stated without proof in 
Theorem 1.7.1: 


* Theorem 1.7.1(c) A triangular matrix is invertible if and only if its diagonal entries are all nonzero. 


* Theorem 1.7.1(d) The inverse of an invertible lower triangular matrix is lower triangular, and the inverse of an 
invertible upper triangular matrix is upper triangular. 


Proof of Theorem 1.7.1(c) Let $A = [a_{ij}]$ be a triangular matrix, so that its diagonal entries are

\[ a_{11},\ a_{22},\ \ldots,\ a_{nn} \]

From Theorem 2.1.2, the matrix A is invertible if and only if

\[ \det(A) = a_{11}a_{22}\cdots a_{nn} \]

is nonzero, which is true if and only if the diagonal entries are all nonzero.


Proof of Theorem 1.7.1(d) We will prove the result for upper triangular matrices and leave the lower
triangular case for you. Assume that A is upper triangular and invertible. Since

\[ A^{-1} = \frac{1}{\det(A)}\,\operatorname{adj}(A) \]

we can prove that $A^{-1}$ is upper triangular by showing that adj(A) is upper triangular or, equivalently, that the
matrix of cofactors is lower triangular. We can do this by showing that every cofactor $C_{ij}$ with i < j (i.e., above
the main diagonal) is zero. Since

\[ C_{ij} = (-1)^{i+j} M_{ij} \]

it suffices to show that each minor $M_{ij}$ with i < j is zero. For this purpose, let $B_{ij}$ be the matrix that results when
the ith row and jth column of A are deleted, so

\[ M_{ij} = \det(B_{ij}) \tag{12} \]

From the assumption that i < j, it follows that $B_{ij}$ is upper triangular (see Figure 1.7.1). Since A is upper
triangular, its (i + 1)-st row begins with at least i zeros. But the ith row of $B_{ij}$ is the (i + 1)-st row of A with the
entry in the jth column removed. Since i < j, none of the first i zeros is removed by deleting the jth column; thus
the ith row of $B_{ij}$ starts with at least i zeros, which implies that this row has a zero on the main diagonal. It now
follows from Theorem 2.1.2 that $\det(B_{ij}) = 0$ and from (12) that $M_{ij} = 0$.
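As a concrete check of the argument, consider a small upper triangular matrix chosen here only for illustration. The three minors above the main diagonal each contain a zero row or a zero column, so they vanish, and the adjoint comes out upper triangular:

```latex
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix},
\qquad
M_{12} = \begin{vmatrix} 0 & 5 \\ 0 & 6 \end{vmatrix} = 0,
\quad
M_{13} = \begin{vmatrix} 0 & 4 \\ 0 & 0 \end{vmatrix} = 0,
\quad
M_{23} = \begin{vmatrix} 1 & 2 \\ 0 & 0 \end{vmatrix} = 0
\]
\[
\operatorname{adj}(A) = \begin{bmatrix} 24 & -12 & -2 \\ 0 & 6 & -5 \\ 0 & 0 & 4 \end{bmatrix},
\qquad
A^{-1} = \frac{1}{24}\begin{bmatrix} 24 & -12 & -2 \\ 0 & 6 & -5 \\ 0 & 0 & 4 \end{bmatrix}
\]
```

Here det(A) = 1 · 4 · 6 = 24, and multiplying A by adj(A) gives 24 I, in agreement with Equation 10.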


Concept Review 

* Determinant test for invertibility 
* Matrix of cofactors 

* Adjoint of a matrix 

* Cramer's rule 


* Equivalent statements about an invertible matrix 


Skills 


* Know how determinants behave with respect to basic arithmetic operations, as given in Equation 1, 
Theorem 2.3.1, Lemma 2.3.2, and Theorem 2.3.4. 


* Use the determinant to test a matrix for invertibility. 


* Know how det(A) and $\det(A^{-1})$ are related.


* Compute the matrix of cofactors for a square matrix A. 
* Compute adj(A) for a square matrix A.
* Use the adjoint of an invertible matrix to find its inverse. 


* Use Cramer's rule to solve linear systems of equations. 


* Know the equivalent characterizations of an invertible matrix given in Theorem 2.3.8. 


Exercise Set 2.3 


In Exercises 1–4, verify that $\det(kA) = k^n \det(A)$.


2 2 2 
‘A= = =—4 
[5 2) 
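3. $A = \begin{bmatrix} 2 & -1 & 3 \\ 3 & 2 & 1 \\ 1 & 4 & 5 \end{bmatrix}$; k = −2

4. $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 0 & 1 & -2 \end{bmatrix}$; k = 3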


In Exercises 5–6, verify that det(AB) = det(BA) and determine whether the equality
det(A + B) = det(A) + det(B) holds.


5. 210 ї 13 

A=|3 4 0| and В=|7 12 
002 5 01 

6 al 2 sl =4 

A= 0 -1| ad B=|1 1 3 
-2 2 0 3-1 


In Exercises 7-14, use determinants to decide whether the given matrix is invertible. 
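7. $A = \begin{bmatrix} 2 & 5 & 5 \\ -1 & -1 & 0 \\ 2 & 4 & 3 \end{bmatrix}$
Answer:
Invertible

8. $A = \begin{bmatrix} 2 & 0 & 3 \\ 0 & 3 & 2 \\ -2 & 0 & -4 \end{bmatrix}$

9. $A = \begin{bmatrix} 2 & -3 & 5 \\ 0 & 1 & -3 \\ 0 & 0 & 2 \end{bmatrix}$
Answer:
Invertible

10. $A = \begin{bmatrix} -3 & 0 & 1 \\ 5 & 0 & 6 \\ 8 & 0 & 3 \end{bmatrix}$

11. $A = \begin{bmatrix} 4 & 2 & 8 \\ -2 & 1 & -4 \\ 3 & 1 & 6 \end{bmatrix}$
Answer:
Not invertible

12. $A = \begin{bmatrix} 1 & 0 & -1 \\ 9 & -1 & 4 \\ 8 & 9 & -1 \end{bmatrix}$

13. $A = \begin{bmatrix} 2 & 0 & 0 \\ 8 & 1 & 0 \\ -5 & 3 & 6 \end{bmatrix}$
Answer:
Invertible

14. $A = \begin{bmatrix} \sqrt{2} & -\sqrt{7} & 0 \\ 3\sqrt{2} & -3\sqrt{7} & 0 \\ 5 & -9 & 0 \end{bmatrix}$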

In Exercises 15—18, find the values of k for which A is invertible. 


15. ,. |k-3 -—2 
а= |+”, ES 


Answer: 


5017 
k£ 5 


16. , [k 2 
4i] 


ы 
| 
а Wo 
UJ кє. [о 
= 


ы 

| 
о omo 
м — м 
— o 


In Exercises 19–23, decide whether the given matrix is invertible, and if so, use the adjoint method to find its
inverse.


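19. $A = \begin{bmatrix} 2 & 5 & 5 \\ -1 & -1 & 0 \\ 2 & 4 & 3 \end{bmatrix}$
Answer:
$A^{-1} = \begin{bmatrix} 3 & -5 & -5 \\ -3 & 4 & 5 \\ 2 & -2 & -3 \end{bmatrix}$

20. $A = \begin{bmatrix} 2 & 0 & 3 \\ 0 & 3 & 2 \\ -2 & 0 & -4 \end{bmatrix}$

21. $A = \begin{bmatrix} 2 & -3 & 5 \\ 0 & 1 & -3 \\ 0 & 0 & 2 \end{bmatrix}$
Answer:
$A^{-1} = \begin{bmatrix} \frac{1}{2} & \frac{3}{2} & 1 \\ 0 & 1 & \frac{3}{2} \\ 0 & 0 & \frac{1}{2} \end{bmatrix}$

22. $A = \begin{bmatrix} 2 & 0 & 0 \\ 8 & 1 & 0 \\ -5 & 3 & 6 \end{bmatrix}$

23. $A = \begin{bmatrix} 1 & 3 & 1 & 1 \\ 2 & 5 & 2 & 2 \\ 1 & 3 & 8 & 9 \\ 1 & 3 & 2 & 2 \end{bmatrix}$
Answer:
$A^{-1} = \begin{bmatrix} -4 & 3 & 0 & -1 \\ 2 & -1 & 0 & 0 \\ -7 & 0 & -1 & 8 \\ 6 & 0 & 1 & -7 \end{bmatrix}$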


In Exercises 24—29, solve by Cramer's rule, where it applies. 


24, 7x1 = 2x3 = 3 
3x1 + x2 = 5 
25. 4x + 5y 
lix + y+ 2 
x + Sy + 22 
Answer 
26. X = 4y + 27 6 
4x = у + 22 = —1 
2x + 2у = Zz —20 
27. X = 3x2 + x3 = 4 
2x, = x3 = =2 
4х1 — 3х3 = 0 
Answer: 
ТЕСНЕЕ M 
28. =x] = 4х2 + 2x3 + x4 = -—32 
2x1 — хз + 7х3 + 9х4 = 14 
=x] + x2 + 3x3 + x4 = 11 
хро = 2x3 + x3 = 44 = —4 
29. 3x, = X2 + x3 4 
=x; + 7x3 — 2x3 = 1 
2x1 + бхз — x3 = 5 
Answer: 


Cramer's rule does not apply. 


30. Show that the matrix

\[ A = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

is invertible for all values of θ; then find $A^{-1}$ using Theorem 2.3.6.


31. Use Cramer's rule to solve for y without solving for the unknowns x, z, and w. 


3x + Fy = z + w= 1 
Tx + Зу — 5 + 8w = —3 
x 4 y 4 2 + 2w = 3 
Answer: 
y=0 


32. Let Ax = b be the system in Exercise 31.
(a) Solve by Cramer's rule. 
(b) Solve by Gauss-Jordan elimination. 
(c) Which method involves fewer computations? 
33. Prove that if det(A) = 1 and all the entries in A are integers, then all the entries in $A^{-1}$ are integers.


34. Let Ax = b be a system of n linear equations in n unknowns with integer coefficients and integer constants.
Prove that if det(A) = 1, the solution x has integer entries.


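35. Let

\[ A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \]

Assuming that det(A) = −7, find

(a) $\det(3A)$

(b) $\det(A^{-1})$

(c) $\det(2A^{-1})$

(d) $\det((2A)^{-1})$

(e) $\det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}$

Answer:

(a) −189

(b) $-\frac{1}{7}$

(c) $-\frac{8}{7}$

(d) $-\frac{1}{56}$

(e) 7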


36. In each part, find the determinant given that A is a 4 × 4 matrix for which det(A) = −2.
(a) $\det(-A)$
(b) $\det(A^{-1})$
(c) $\det(2A^{T})$
(d) $\det(A^{3})$


37. In each part, find the determinant given that A is a 3 × 3 matrix for which det(A) = 7.
(a) $\det(3A)$
(b) $\det(A^{-1})$
(c) $\det(2A^{-1})$
(d) $\det((2A)^{-1})$

Answer:

(a) 189
(b) $\frac{1}{7}$
(c) $\frac{8}{7}$
(d) $\frac{1}{56}$

38. Prove that a square matrix A is invertible if and only if $A^{T}A$ is invertible.
39. Show that if A is a square matrix, then $\det(A^{T}A) = \det(AA^{T})$.


True-False Exercises 


In parts (a)–(l) determine whether the statement is true or false, and justify your answer.

(a) If A is a 3 × 3 matrix, then det(2A) = 2 det(A).
Answer:
False

(b) If A and B are square matrices of the same size such that det(A) = det(B), then det(A + B) = 2 det(A).
Answer:
False

(c) If A and B are square matrices of the same size and A is invertible, then $\det(A^{-1}BA) = \det(B)$.
Answer:
True

(d) A square matrix A is invertible if and only if det(A) = 0.
Answer:
False

(e) The matrix of cofactors of A is precisely $[\operatorname{adj}(A)]^{T}$.
Answer:
True

(f) For every n × n matrix A, we have $A \cdot \operatorname{adj}(A) = (\det(A))\,I_n$.
Answer:
True

(g) If A is a square matrix and the linear system Ax = 0 has multiple solutions for x, then det(A) = 0.
Answer:
True

(h) If A is an n × n matrix and there exists an n × 1 matrix b such that the linear system Ax = b has no solutions,
then the reduced row echelon form of A cannot be $I_n$.
Answer:
True

(i) If E is an elementary matrix, then Ex = 0 has only the trivial solution.
Answer:
True

(j) If A is an invertible matrix, then the linear system Ax = 0 has only the trivial solution if and only if the linear
system $A^{-1}\mathbf{x} = \mathbf{0}$ has only the trivial solution.
Answer:
True

(k) If A is invertible, then adj(A) must also be invertible.
Answer:
True

(l) If A has a row of zeros, then so does adj(A).
Answer:
False
Chapter 2 Supplementary Exercises


In Exercises 1—8, evaluate the determinant of the given matrix by (a) cofactor expansion and (b) using 
elementary row operations to introduce zeros into the matrix. 


ur 


Answer: 


—18 


2.| 7 —1 
-2 -—6 
2 


5.|3 0 —1 
1 1 4 
04 2 


Answer: 
—10 


6„|—5 1 
з 0 
2 


hme he 


9. Evaluate the determinants in Exercises 3—6 by using the arrow technique (see Example 7 in Section 2.1). 


Answer: 


Exercise 3: 24; Exercise 4: 0; Exercise 5: 10; Exercise 6: 248 


10. (a) Construct a 4 x 4 matrix whose determinant is easy to compute using cofactor expansion but hard to 
evaluate using elementary row operations. 


(b) Construct a 4 х 4 matrix whose determinant is easy to compute using elementary row operations but 
hard to evaluate using cofactor expansion. 


11. Use the determinant to decide whether the matrices in Exercises 1—4 are invertible. 


Answer: 


The matrices in Exercises 1—3 are invertible, the matrix in Exercise 4 is not. 


12. Use the determinant to decide whether the matrices in Exercises 5—8 are invertible. 


In Exercises 13—15, find the determinant of the given matrix by any method. 


13.) 5 4=3 
b-2 —3 
Answer: 
=A + 5b – 21 
14.|3 =4 a 
a^ 1 2 
2 а—1 4 
15. $\begin{vmatrix} 0 & 0 & 0 & 0 & -3 \\ 0 & 0 & 0 & -4 & 0 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 5 & 0 & 0 & 0 & 0 \end{vmatrix}$
Answer:
−120
16. Solve for x. 
= 10 -3 
з |= 2 х —6 
E E77 


In Exercises 17–24, use the adjoint method (Theorem 2.3.6) to find the inverse of the given matrix, if it
exists.


17. The matrix in Exercise 1. 


Answer: 


xn» sole 


18. The matrix in Exercise 2. 


19. The matrix in Exercise 3. 


Answer: 

d. d uud 
8 8 8 
d le 
8 24 24 
di oho od 
4 12 12 


20. The matrix in Exercise 4. 


21. The matrix in Exercise 5. 


Answer: 
du LL 
5 5 10 
d oe 2 
5 5 5 
E XE CONES 
5 5 10 


22. The matrix in Exercise 6. 


23. The matrix in Exercise 7. 
Answer: 


10 2 52 27 
329 329 329 329 


24. The matrix in Exercise 8. 


25. Use Cramer's rule to solve for x′ and y′ in terms of x and y.

\[
\begin{aligned}
x &= \tfrac{3}{5}x' - \tfrac{4}{5}y' \\
y &= \tfrac{4}{5}x' + \tfrac{3}{5}y'
\end{aligned}
\]

Answer:
$x' = \tfrac{3}{5}x + \tfrac{4}{5}y, \quad y' = -\tfrac{4}{5}x + \tfrac{3}{5}y$

26. Use Cramer's rule to solve for x′ and y′ in terms of x and y.

\[
\begin{aligned}
x &= x'\cos\theta - y'\sin\theta \\
y &= x'\sin\theta + y'\cos\theta
\end{aligned}
\]

27. By examining the determinant of the coefficient matrix, show that the following system has a nontrivial
solution if and only if α = β.

\[
\begin{aligned}
x + y + \alpha z &= 0 \\
x + y + \beta z &= 0 \\
\alpha x + \beta y + z &= 0
\end{aligned}
\]

28. Let A be a 3 × 3 matrix, each of whose entries is 1 or 0. What is the largest possible value for det(A)?

29. (a) For the triangle in the accompanying figure, use trigonometry to show that

\[
\begin{aligned}
b\cos\gamma + c\cos\beta &= a \\
c\cos\alpha + a\cos\gamma &= b \\
a\cos\beta + b\cos\alpha &= c
\end{aligned}
\]

and then apply Cramer's rule to show that

\[ \cos\alpha = \frac{b^2 + c^2 - a^2}{2bc} \]

(b) Use Cramer's rule to obtain similar formulas for cos β and cos γ.

Figure Ex-29

Answer:
(b) $\cos\beta = \dfrac{a^2 + c^2 - b^2}{2ac}, \quad \cos\gamma = \dfrac{a^2 + b^2 - c^2}{2ab}$

30. Use determinants to show that for all real values of λ, the only solution of

\[
\begin{aligned}
x - 2y &= \lambda x \\
x - y &= \lambda y
\end{aligned}
\]

is x = 0, y = 0.


31. Prove: If A is invertible, then adj(A) is invertible and

\[ [\operatorname{adj}(A)]^{-1} = \frac{1}{\det(A)}\,A = \operatorname{adj}(A^{-1}) \]

32. Prove: If A is an n × n matrix, then

\[ \det[\operatorname{adj}(A)] = [\det(A)]^{n-1} \]

33. Prove: If the entries in each row of an n × n matrix A add up to zero, then the determinant of A is zero.
[Hint: Consider the product AX, where X is the n × 1 matrix, each of whose entries is one.]


34. (a) In the accompanying figure, the area of the triangle ABC can be expressed as

area ABC = area ADEC + area CEFB − area ADFB

Use this and the fact that the area of a trapezoid equals $\frac{1}{2}$ the altitude times the sum of the parallel
sides to show that

\[ \text{area } ABC = \frac{1}{2} \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} \]

[Note: In the derivation of this formula, the vertices are labeled such that the triangle is traced
counterclockwise proceeding from $(x_1, y_1)$ to $(x_2, y_2)$ to $(x_3, y_3)$. For a clockwise orientation, the
determinant above yields the negative of the area.]

(b) Use the result in (a) to find the area of the triangle with vertices (3, 3), (4, 0), (−2, −1).

Figure Ex-34


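35. Use the fact that 21,375, 38,798, 34,162, 40,223, and 79,154 are all divisible by 19 to show that

\[ \begin{vmatrix} 2 & 1 & 3 & 7 & 5 \\ 3 & 8 & 7 & 9 & 8 \\ 3 & 4 & 1 & 6 & 2 \\ 4 & 0 & 2 & 2 & 3 \\ 7 & 9 & 1 & 5 & 4 \end{vmatrix} \]

is divisible by 19 without directly evaluating the determinant.

36. Without directly evaluating the determinant, show that

\[ \begin{vmatrix} \sin\alpha & \cos\alpha & \sin(\alpha + \delta) \\ \sin\beta & \cos\beta & \sin(\beta + \delta) \\ \sin\gamma & \cos\gamma & \sin(\gamma + \delta) \end{vmatrix} = 0 \]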
Euclidean Vector Spaces


CHAPTER CONTENTS 


3.1. Vectors in 2-Space, 3-Space, and n-Space 


3.2. Norm, Dot Product, and Distance in $R^n$
3.3. Orthogonality 

3.4. The Geometry of Linear Systems 

3.5. Cross Product 


INTRODUCTION 


Engineers and physicists distinguish between two types of physical quantities—scalars, 
which are quantities that can be described by a numerical value alone, and vectors, which 
are quantities that require both a number and a direction for their complete physical 
description. For example, temperature, length, and speed are scalars because they can be 
fully described by a number that tells “how much"—a temperature of 20°C, a length of 5 
cm, or a speed of 75 km/h. In contrast, velocity and force are vectors because they require 
a number that tells "how much" and a direction that tells "which way", say, a boat
moving at 10 knots in a direction 45° northeast, or a force of 100 lb acting vertically.
Although the notions of vectors and scalars that we will study in this text have their 
origins in physics and engineering, we will be more concerned with using them to build 
mathematical structures and then applying those structures to such diverse fields as 
genetics, computer science, economics, telecommunications, and environmental science. 
3.1 Vectors in 2-Space, 3-Space, and n-Space


Linear algebra is concerned with two kinds of mathematical objects, “matrices” and “vectors.” We are already 
familiar with the basic ideas about matrices, so in this section we will introduce some of the basic ideas about 
vectors. As we progress through this text we will see that vectors and matrices are closely related and that 
much of linear algebra is concerned with that relationship. 


Geometric Vectors 


Engineers and physicists represent vectors in two dimensions (also called 2-space) or in three dimensions 
(also called 3-space) by arrows. The direction of the arrowhead specifies the direction of the vector and the 
length of the arrow specifies the magnitude. Mathematicians call these geometric vectors. The tail of the 
arrow is called the initial point of the vector and the tip the terminal point (Figure 3.1.1). 


Terminal point 


Initial point 


Figure 3.1.1 


In this text we will denote vectors in boldface type such as a, b, v, w, and x, and we will denote scalars in
lowercase italic type such as a, k, v, w, and x. When we want to indicate that a vector v has initial point A and
terminal point B, then, as shown in Figure 3.1.2, we will write

\[ \mathbf{v} = \overrightarrow{AB} \]

Figure 3.1.2


Vectors with the same length and direction, such as those in Figure 3.1.3, are said to be equivalent. Since we
want a vector to be determined solely by its length and direction, equivalent vectors are regarded to be the
same vector even though they may be in different positions. Equivalent vectors are also said to be equal,
which we indicate by writing

\[ \mathbf{v} = \mathbf{w} \]

Figure 3.1.3 Equivalent vectors


The vector whose initial and terminal points coincide has length zero, so we call this the zero vector and 
denote it by 0. The zero vector has no natural direction, so we will agree that it can be assigned any direction 
that is convenient for the problem at hand. 


Vector Addition 


There are a number of important algebraic operations on vectors, all of which have their origin in laws of 
physics. 


Parallelogram Rule for Vector Addition 


If v and w are vectors in 2-space or 3-space that are positioned so their initial points coincide, then the
two vectors form adjacent sides of a parallelogram, and the sum v + w is the vector represented by
the arrow from the common initial point of v and w to the opposite vertex of the parallelogram
(Figure 3.1.4a).


Figure 3.1.4


Here is another way to form the sum of two vectors. 


Triangle Rule for Vector Addition 


If v and w are vectors in 2-space or 3-space that are positioned so the initial point of w is at the
terminal point of v, then the sum v + w is represented by the arrow from the initial point of v to the
terminal point of w (Figure 3.1.4b).


In Figure 3.1.4c we have constructed the sums v + w and w + v by the triangle rule. This construction makes
it evident that

\[ \mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v} \tag{1} \]

and that the sum obtained by the triangle rule is the same as the sum obtained by the parallelogram rule.


Vector addition can also be viewed as a process of translating points. 


Vector Addition Viewed as Translation 


If v, w, and v + w are positioned so their initial points coincide, then the terminal point of v + w can
be viewed in two ways:


1. The terminal point of v + w is the point that results when the terminal point of v is translated in
the direction of w by a distance equal to the length of w (Figure 3.1.5a).


2. The terminal point of v + w is the point that results when the terminal point of w is translated in
the direction of v by a distance equal to the length of v (Figure 3.1.5b).


Accordingly, we say that v + w is the translation of v by w or, alternatively, the translation of w by v.


Figure 3.1.5 


Vector Subtraction 


In ordinary arithmetic we can write a − b = a + (−b), which expresses subtraction in terms of addition.
There is an analogous idea in vector arithmetic. 


Vector Subtraction 


The negative of a vector v, denoted by −v, is the vector that has the same length as v but is
oppositely directed (Figure 3.1.6a), and the difference of v from w, denoted by w − v, is taken to be
the sum

\[ \mathbf{w} - \mathbf{v} = \mathbf{w} + (-\mathbf{v}) \tag{2} \]

Figure 3.1.6


The difference of v from w can be obtained geometrically by the parallelogram method shown in Figure
3.1.6b, or more directly by positioning w and v so their initial points coincide and drawing the vector from the
terminal point of v to the terminal point of w (Figure 3.1.6c).


Scalar Multiplication 


Sometimes there is a need to change the length of a vector or change its length and reverse its direction. This
is accomplished by a type of multiplication in which vectors are multiplied by scalars. As an example, the
product 2v denotes the vector that has the same direction as v but twice the length, and the product −2v
denotes the vector that is oppositely directed to v and has twice the length. Here is the general result.


Scalar Multiplication 


If v is a nonzero vector in 2-space or 3-space, and if k is a nonzero scalar, then we define the scalar
product of v by k to be the vector whose length is |k| times the length of v and whose direction is the
same as that of v if k is positive and opposite to that of v if k is negative. If k = 0 or v = 0, then we
define kv to be 0.


Figure 3.1.7 shows the geometric relationship between a vector v and some of its scalar multiples. In
particular, observe that (−1)v has the same length as v but is oppositely directed; therefore,

\[ (-1)\mathbf{v} = -\mathbf{v} \tag{3} \]


Figure 3.1.7 


Parallel and Collinear Vectors 


Suppose that v and w are vectors in 2-space or 3-space with a common initial point. If one of the vectors is a
scalar multiple of the other, then the vectors lie on a common line, so it is reasonable to say that they are
collinear (Figure 3.1.8a). However, if we translate one of the vectors, as indicated in Figure 3.1.8b, then the
vectors are parallel but no longer collinear. This creates a linguistic problem because translating a vector does
not change it. The only way to resolve this problem is to agree that the terms parallel and collinear mean the
same thing when applied to vectors. Although the vector 0 has no clearly defined direction, we will regard it
to be parallel to all vectors when convenient.


(a) (b) 


Figure 3.1.8 


Sums of Three or More Vectors 


Vector addition satisfies the associative law for addition, meaning that when we add three vectors, say u, v,
and w, it does not matter which two we add first; that is,

\[ \mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w} \]

It follows from this that there is no ambiguity in the expression u + v + w because the same result is obtained
no matter how the vectors are grouped.


A simple way to construct u + v + w is to place the vectors "tip to tail" in succession and then draw the
vector from the initial point of u to the terminal point of w (Figure 3.1.9a). The tip-to-tail method also works
for four or more vectors (Figure 3.1.9b). The tip-to-tail method also makes it evident that if u, v, and w are
vectors in 3-space with a common initial point, then u + v + w is the diagonal of the parallelepiped that has
the three vectors as adjacent sides (Figure 3.1.9c).


(а) (b) (c) 


Figure 3.1.9 


Vectors in Coordinate Systems 


Up until now we have discussed vectors without reference to a coordinate system. However, as we will soon 
see, computations with vectors are much simpler to perform if a coordinate system is present to work with. 


The component forms of the zero vector are 
0 = (0, 0) in 2-space and 0 = (0, 0, 0) in 
3-space. 


If a vector v in 2-space or 3-space is positioned with its initial point at the origin of a rectangular coordinate
system, then the vector is completely determined by the coordinates of its terminal point (Figure 3.1.10). We
call these coordinates the components of v relative to the coordinate system. We will write $\mathbf{v} = (v_1, v_2)$ to
denote a vector v in 2-space with components $(v_1, v_2)$, and $\mathbf{v} = (v_1, v_2, v_3)$ to denote a vector v in 3-space
with components $(v_1, v_2, v_3)$.


Figure 3.1.10


It should be evident geometrically that two vectors in 2-space or 3-space are equivalent if and only if they
have the same terminal point when their initial points are at the origin. Algebraically, this means that two
vectors are equivalent if and only if their corresponding components are equal. Thus, for example, the vectors

\[ \mathbf{v} = (v_1, v_2, v_3) \quad\text{and}\quad \mathbf{w} = (w_1, w_2, w_3) \]

in 3-space are equivalent if and only if

\[ v_1 = w_1, \quad v_2 = w_2, \quad v_3 = w_3 \]


Remark It may have occurred to you that an ordered pair $(v_1, v_2)$ can represent either a vector with
components $v_1$ and $v_2$ or a point with components $v_1$ and $v_2$ (and similarly for ordered triples). Both are valid
geometric interpretations, so the appropriate choice will depend on the geometric viewpoint that we want to
emphasize (Figure 3.1.11).


Figure 3.1.11 The ordered pair $(v_1, v_2)$ can represent a point or a vector.


Vectors Whose Initial Point Is Not at the Origin 


It is sometimes necessary to consider vectors whose initial points are not at the origin. If $\overrightarrow{P_1P_2}$ denotes the
vector with initial point $P_1(x_1, y_1)$ and terminal point $P_2(x_2, y_2)$, then the components of this vector are
given by the formula

\[ \overrightarrow{P_1P_2} = (x_2 - x_1,\ y_2 - y_1) \tag{4} \]

That is, the components of $\overrightarrow{P_1P_2}$ are obtained by subtracting the coordinates of the initial point from the
coordinates of the terminal point. For example, in Figure 3.1.12 the vector $\overrightarrow{P_1P_2}$ is the difference of vectors
$\overrightarrow{OP_2}$ and $\overrightarrow{OP_1}$, so

\[ \overrightarrow{P_1P_2} = \overrightarrow{OP_2} - \overrightarrow{OP_1} = (x_2, y_2) - (x_1, y_1) = (x_2 - x_1,\ y_2 - y_1) \]

As you might expect, the components of a vector in 3-space that has initial point $P_1(x_1, y_1, z_1)$ and terminal
point $P_2(x_2, y_2, z_2)$ are given by

\[ \overrightarrow{P_1P_2} = (x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1) \tag{5} \]

Figure 3.1.12


EXAMPLE 1 Finding the Components of a Vector


The components of the vector $\mathbf{v} = \overrightarrow{P_1P_2}$ with initial point $P_1(2, -1, 4)$ and terminal point
$P_2(7, 5, -8)$ are

\[ \mathbf{v} = (7 - 2,\ 5 - (-1),\ (-8) - 4) = (5, 6, -12) \]
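Formulas (4) and (5) translate directly into code. The sketch below is illustrative only; the function name `components` is an assumption, and points are represented as plain tuples of coordinates. It reproduces the computation in Example 1.

```python
def components(initial, terminal):
    """Components of the vector from `initial` to `terminal`.

    Each component is the terminal coordinate minus the initial coordinate.
    """
    if len(initial) != len(terminal):
        raise ValueError("points must have the same number of coordinates")
    return tuple(t - s for s, t in zip(initial, terminal))


P1 = (2, -1, 4)
P2 = (7, 5, -8)
v = components(P1, P2)
print(v)  # (5, 6, -12)
```

The same function works unchanged for points in 2-space or in higher dimensions, since it only relies on matching coordinates pairwise.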


n-Space 


The idea of using ordered pairs and triples of real numbers to represent points in two-dimensional space and 
three-dimensional space was well known in the eighteenth and nineteenth centuries. By the dawn of the 
twentieth century, mathematicians and physicists were exploring the use of “higher-dimensional” spaces in 
mathematics and physics. Today, even the layman is familiar with the notion of time as a fourth dimension, an 
idea used by Albert Einstein in developing the general theory of relativity. Today, physicists working in the 
field of “string theory" commonly use 11-dimensional space in their quest for a unified theory that will 
explain how the fundamental forces of nature work. Much of the remaining work in this section is concerned 
with extending the notion of space to n-dimensions. 


To explore these ideas further, we start with some terminology and notation. The set of all real numbers can
be viewed geometrically as a line. It is called the real line and is denoted by $R$ or $R^1$. The superscript
reinforces the intuitive idea that a line is one-dimensional. The set of all ordered pairs of real numbers (called
2-tuples) and the set of all ordered triples of real numbers (called 3-tuples) are denoted by $R^2$ and $R^3$,
respectively. The superscript reinforces the idea that the ordered pairs correspond to points in the plane
(two-dimensional) and ordered triples to points in space (three-dimensional). The following definition extends
this idea.


DEFINITION 1 


If n is a positive integer, then an ordered n-tuple is a sequence of n real numbers $(v_1, v_2, \ldots, v_n)$.
The set of all ordered n-tuples is called n-space and is denoted by $R^n$.


Remark You can think of the numbers in an n-tuple $(v_1, v_2, \ldots, v_n)$ as either the coordinates of a
generalized point or the components of a generalized vector, depending on the geometric image you want to
bring to mind. The choice makes no difference mathematically, since it is the algebraic properties of n-tuples
that are of concern.


Here are some typical applications that lead to n-tuples. 


Experimental Data A scientist performs an experiment and makes n numerical measurements each time
the experiment is performed. The result of each experiment can be regarded as a vector
$\mathbf{y} = (y_1, y_2, \ldots, y_n)$ in $R^n$ in which $y_1, y_2, \ldots, y_n$ are the measured values.


Storage and Warehousing A national trucking company has 15 depots for storing and servicing its trucks.
At each point in time the distribution of trucks in the service depots can be described by a 15-tuple
$\mathbf{x} = (x_1, x_2, \ldots, x_{15})$ in which $x_1$ is the number of trucks in the first depot, $x_2$ is the number in the second
depot, and so forth.


Electrical Circuits A certain kind of processing chip is designed to receive four input voltages and
produces three output voltages in response. The input voltages can be regarded as vectors in $R^4$ and the
output voltages as vectors in $R^3$. Thus, the chip can be viewed as a device that transforms an input vector
$\mathbf{v} = (v_1, v_2, v_3, v_4)$ in $R^4$ into an output vector $\mathbf{w} = (w_1, w_2, w_3)$ in $R^3$.


Graphical Images One way in which color images are created on computer screens is by assigning each
pixel (an addressable point on the screen) three numbers that describe the hue, saturation, and brightness
of the pixel. Thus, a complete color image can be viewed as a set of 5-tuples of the form $\mathbf{v} = (x, y, h, s, b)$
in which x and y are the screen coordinates of a pixel and h, s, and b are its hue, saturation, and brightness.


Economics One approach to economic analysis is to divide an economy into sectors (manufacturing,
services, utilities, and so forth) and measure the output of each sector by a dollar value. Thus, in an
economy with 10 sectors the economic output of the entire economy can be represented by a 10-tuple
$\mathbf{s} = (s_1, s_2, \ldots, s_{10})$ in which the numbers $s_1, s_2, \ldots, s_{10}$ are the outputs of the individual sectors.


Mechanical Systems Suppose that six particles move along the same coordinate line so that at time t their
coordinates are $x_1, x_2, \ldots, x_6$ and their velocities are $v_1, v_2, \ldots, v_6$, respectively. This information can be
represented by the vector

\[ \mathbf{v} = (x_1, x_2, x_3, x_4, x_5, x_6, v_1, v_2, v_3, v_4, v_5, v_6, t) \]

in $R^{13}$. This vector is called the state of the particle system at time t.


Albert Einstein (1879-1955) 


Historical Note The German-born physicist Albert Einstein immigrated to the United States in
1933, where he settled at Princeton University. Einstein spent the last three decades of his life
working unsuccessfully at producing a unified field theory that would establish an underlying link 
between the forces of gravity and electromagnetism. Recently, physicists have made progress on the 
problem using a framework known as string theory. In this theory the smallest, indivisible 
components of the Universe are not particles but loops that behave like vibrating strings. Whereas
Einstein's space-time universe was four-dimensional, strings reside in an 11-dimensional world that is
the focus of current research.
[Image: © Bettmann/© Corbis]


Operations on Vectors in $R^n$


Our next goal is to define useful operations on vectors in $R^n$. These operations will all be natural extensions
of the familiar operations on vectors in $R^2$ and $R^3$. We will denote a vector v in $R^n$ using the notation

\[ \mathbf{v} = (v_1, v_2, \ldots, v_n) \]

and we will call 0 = (0, 0, ..., 0) the zero vector.


We noted earlier that in $R^2$ and $R^3$ two vectors are equivalent (equal) if and only if their corresponding
components are the same. Thus, we make the following definition.


DEFINITION 2 


Vectors $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ and $\mathbf{w} = (w_1, w_2, \ldots, w_n)$ in $R^n$ are said to be equivalent (also called
equal) if

\[ v_1 = w_1, \quad v_2 = w_2, \quad \ldots, \quad v_n = w_n \]

We indicate this by writing v = w.


EXAMPLE 2 Equality of Vectors


\[ (a, b, c, d) = (1, -4, 2, 7) \]

if and only if a = 1, b = −4, c = 2, and d = 7.


Our next objective is to define the operations of addition, subtraction, and scalar multiplication for vectors in
$R^n$. To motivate these ideas, we will consider how these operations can be performed on vectors in $R^2$ using
components. By studying Figure 3.1.13 you should be able to deduce that if $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$,
then

$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$ (6)

$k\mathbf{v} = (kv_1, kv_2)$ (7)

In particular, it follows from 7 that

$-\mathbf{v} = (-1)\mathbf{v} = (-v_1, -v_2)$ (8)

and hence that

$\mathbf{w} - \mathbf{v} = \mathbf{w} + (-\mathbf{v}) = (w_1 - v_1, w_2 - v_2)$ (9)

Figure 3.1.13

Motivated by Formulas 6-9, we make the following definition.

DEFINITION 3

If $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ and $\mathbf{w} = (w_1, w_2, \ldots, w_n)$ are vectors in $R^n$, and if $k$ is any scalar, then we
define

$\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, \ldots, v_n + w_n)$ (10)

$k\mathbf{v} = (kv_1, kv_2, \ldots, kv_n)$ (11)

$-\mathbf{v} = (-v_1, -v_2, \ldots, -v_n)$ (12)

$\mathbf{w} - \mathbf{v} = \mathbf{w} + (-\mathbf{v}) = (w_1 - v_1, w_2 - v_2, \ldots, w_n - v_n)$ (13)


In words, vectors are added (or subtracted) by 
adding (or subtracting) their corresponding 
components, and a vector is multiplied by a 
scalar by multiplying each component by that 
scalar. 


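EXAMPLE 3 Algebraic Operations Using Components

If $\mathbf{v} = (1, -3, 2)$ and $\mathbf{w} = (4, 2, 1)$, then

$\mathbf{v} + \mathbf{w} = (5, -1, 3)$, $\quad 2\mathbf{v} = (2, -6, 4)$
$-\mathbf{w} = (-4, -2, -1)$, $\quad \mathbf{v} - \mathbf{w} = \mathbf{v} + (-\mathbf{w}) = (-3, -5, 1)$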
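
The componentwise rules of Definition 3 translate directly into code. The following is a minimal sketch, assuming vectors are stored as Python tuples; the function names `add`, `scale`, `negate`, and `subtract` are illustrative choices and do not come from the text.

```python
def add(v, w):
    """Componentwise sum of two vectors of the same length (Formula 10)."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(v, w))


def scale(k, v):
    """Scalar multiple k*v (Formula 11)."""
    return tuple(k * a for a in v)


def negate(v):
    """Negative of v, computed as (-1)*v (Formula 12)."""
    return scale(-1, v)


def subtract(w, v):
    """Difference w - v, computed as w + (-v) (Formula 13)."""
    return add(w, negate(v))


# The vectors of Example 3.
v = (1, -3, 2)
w = (4, 2, 1)
print(add(v, w))       # v + w
print(scale(2, v))     # 2v
print(negate(w))       # -w
print(subtract(v, w))  # v - w
```

Defining subtraction through `add` and `negate` mirrors the way Formula 13 is built from Formulas 10 and 12.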


The following theorem summarizes the most important properties of vector operations. 


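THEOREM 3.1.1

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $R^n$, and if $k$ and $m$ are scalars, then:
(a) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$
(b) $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$
(c) $\mathbf{u} + \mathbf{0} = \mathbf{0} + \mathbf{u} = \mathbf{u}$
(d) $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$
(e) $k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}$
(f) $(k + m)\mathbf{u} = k\mathbf{u} + m\mathbf{u}$
(g) $k(m\mathbf{u}) = (km)\mathbf{u}$
(h) $1\mathbf{u} = \mathbf{u}$

We will prove part (b) and leave some of the other proofs as exercises.

Proof (b) Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$, $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, and $\mathbf{w} = (w_1, w_2, \ldots, w_n)$. Then

$(\mathbf{u} + \mathbf{v}) + \mathbf{w} = ((u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n)) + (w_1, w_2, \ldots, w_n)$
$= (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n) + (w_1, w_2, \ldots, w_n)$ [Vector addition]
$= ((u_1 + v_1) + w_1, (u_2 + v_2) + w_2, \ldots, (u_n + v_n) + w_n)$ [Vector addition]
$= (u_1 + (v_1 + w_1), u_2 + (v_2 + w_2), \ldots, u_n + (v_n + w_n))$ [Regroup]
$= (u_1, u_2, \ldots, u_n) + (v_1 + w_1, v_2 + w_2, \ldots, v_n + w_n)$ [Vector addition]
$= \mathbf{u} + (\mathbf{v} + \mathbf{w})$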


The following additional properties of vectors in $R^n$ can be deduced easily by expressing the vectors in terms
of components (verify).

THEOREM 3.1.2

If $\mathbf{v}$ is a vector in $R^n$ and $k$ is a scalar, then:
(a) $0\mathbf{v} = \mathbf{0}$
(b) $k\mathbf{0} = \mathbf{0}$
(c) $(-1)\mathbf{v} = -\mathbf{v}$


Calculating Without Components 


One of the powerful consequences of Theorems 3.1.1 and 3.1.2 is that they allow calculations to be performed
without expressing the vectors in terms of components. For example, suppose that $\mathbf{x}$, $\mathbf{a}$, and $\mathbf{b}$ are vectors in
$R^n$, and we want to solve the vector equation $\mathbf{x} + \mathbf{a} = \mathbf{b}$ for the vector $\mathbf{x}$ without using components. We could
proceed as follows:

$\mathbf{x} + \mathbf{a} = \mathbf{b}$ [Given]
$(\mathbf{x} + \mathbf{a}) + (-\mathbf{a}) = \mathbf{b} + (-\mathbf{a})$ [Add the negative of a to both sides]
$\mathbf{x} + (\mathbf{a} + (-\mathbf{a})) = \mathbf{b} - \mathbf{a}$ [Part (b) of Theorem 3.1.1]
$\mathbf{x} + \mathbf{0} = \mathbf{b} - \mathbf{a}$ [Part (d) of Theorem 3.1.1]
$\mathbf{x} = \mathbf{b} - \mathbf{a}$ [Part (c) of Theorem 3.1.1]

While this method is obviously more cumbersome than computing with components in $R^n$, it will become
important later in the text where we will encounter more general kinds of vectors.


Linear Combinations 


Addition, subtraction, and scalar multiplication are frequently used in combination to form new vectors. For
example, if $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are vectors in $R^n$, then the vectors

$\mathbf{u} = 2\mathbf{v}_1 + 3\mathbf{v}_2 + \mathbf{v}_3$ and $\mathbf{w} = 7\mathbf{v}_1 - 6\mathbf{v}_2 + 8\mathbf{v}_3$

are formed in this way. In general, we make the following definition.

DEFINITION 4

If $\mathbf{w}$ is a vector in $R^n$, then $\mathbf{w}$ is said to be a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ in $R^n$ if it
can be expressed in the form

$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r$ (14)

where $k_1, k_2, \ldots, k_r$ are scalars. These scalars are called the coefficients of the linear combination. In
the case where $r = 1$, Formula 14 becomes $\mathbf{w} = k_1\mathbf{v}_1$, so that a linear combination of a single vector
is just a scalar multiple of that vector.


Note that this definition of a linear combination 
is consistent with that given in the context of 
matrices (see Definition 6 in Section 1.3). 


Application of Linear Combinations to Color Models 


Colors on computer monitors are commonly based on what is called the RGB color model. Colors in 
this system are created by adding together percentages of the primary colors red (R), green (G), and 
blue (B). One way to do this is to identify the primary colors with the vectors 


$\mathbf{r} = (1, 0, 0)$ (pure red),
$\mathbf{g} = (0, 1, 0)$ (pure green),
$\mathbf{b} = (0, 0, 1)$ (pure blue)

in $R^3$ and to create all other colors by forming linear combinations of $\mathbf{r}$, $\mathbf{g}$, and $\mathbf{b}$ using coefficients
between 0 and 1, inclusive; these coefficients represent the percentage of each pure color in the mix.
The set of all such color vectors is called RGB space or the RGB color cube (Figure 3.1.14). Thus,
each color vector $\mathbf{c}$ in this cube is expressible as a linear combination of the form

$\mathbf{c} = k_1\mathbf{r} + k_2\mathbf{g} + k_3\mathbf{b}$
$= k_1(1, 0, 0) + k_2(0, 1, 0) + k_3(0, 0, 1)$
$= (k_1, k_2, k_3)$

where $0 \le k_i \le 1$. As indicated in the figure, the corners of the cube represent the pure primary colors
together with the colors black, white, magenta, cyan, and yellow. The vectors along the diagonal
running from black to white correspond to shades of gray.

Figure 3.1.14
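
Formula 14 and the RGB model can be tried out in a few lines. This is a minimal sketch, assuming colors are stored as tuples of floats in the range 0 to 1; the helper name `linear_combination` and the sample colors are illustrative and are not taken from the text.

```python
def linear_combination(coeffs, vectors):
    """Return k1*v1 + k2*v2 + ... + kr*vr for vectors of equal length."""
    if len(coeffs) != len(vectors):
        raise ValueError("need exactly one coefficient per vector")
    n = len(vectors[0])
    result = [0.0] * n
    for k, vec in zip(coeffs, vectors):
        for i in range(n):
            result[i] += k * vec[i]
    return tuple(result)


# The primary colors of the RGB model.
r = (1, 0, 0)
g = (0, 1, 0)
b = (0, 0, 1)

# Full red plus full green gives the yellow corner of the color cube.
yellow = linear_combination((1, 1, 0), (r, g, b))

# Equal coefficients land on the black-to-white diagonal (a shade of gray).
gray = linear_combination((0.5, 0.5, 0.5), (r, g, b))

print(yellow, gray)
```

Because r, g, and b are the standard unit vectors, the coefficients of the combination are exactly the components of the resulting color vector.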


Alternative Notations for Vectors

Up to now we have been writing vectors in $R^n$ using the notation

$\mathbf{v} = (v_1, v_2, \ldots, v_n)$ (15)

We call this the comma-delimited form. However, since a vector in $R^n$ is just a list of its $n$ components in a
specific order, any notation that displays those components in the correct order is a valid way of representing
the vector. For example, the vector in 15 can be written as

$\mathbf{v} = [v_1 \;\; v_2 \;\; \cdots \;\; v_n]$ (16)

which is called row-matrix form, or as

$\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$ (17)

which is called column-matrix form. The choice of notation is often a matter of taste or convenience, but
sometimes the nature of a problem will suggest a preferred notation. Notations 15, 16, and 17 will all be used
at various places in this text.


Concept Review 
* Geometric vector
* Direction
* Length
* Initial point
* Terminal point
* Equivalent vectors
* Zero vector
* Vector addition: parallelogram rule and triangle rule
* Vector subtraction
* Negative of a vector
* Scalar multiplication
* Collinear (i.e., parallel) vectors
* Components of a vector
* Coordinates of a point
* n-tuple
* n-space
* Vector operations in n-space: addition, subtraction, scalar multiplication
* Linear combination of vectors


Skills 

* Perform geometric operations on vectors: addition, subtraction, and scalar multiplication. 
* Perform algebraic operations on vectors: addition, subtraction, and scalar multiplication. 
* Determine whether two vectors are equivalent. 

* Determine whether two vectors are collinear. 

* Sketch vectors whose initial and terminal points are given. 

* Find components of a vector whose initial and terminal points are given. 


* Prove basic algebraic properties of vectors (Theorems 3.1.1 and 3.1.2). 


Exercise Set 3.1 


In Exercises 1-2, draw a coordinate system (as in Figure 3.1.10) and locate the points whose coordinates are
given.

1. (a) (3, 4, 5)
(b) (-3, 4, 5)
(c) (3, -4, 5)
(d) (3, 4, -5)
(e) (-3, -4, 5)
(f) (-3, 4, -5)

Answer:

[Figures for parts (a)-(f): plotted points, not reproduced here.]

2. (a) (0, 3, -3)
(b) (3, -3, 0)
(c) (-3, 0, 0)
(d) (3, 0, 3)
(e) (0, 0, -3)
(f) (0, 3, 0)

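In Exercises 3-4, sketch the following vectors with the initial points located at the origin.

3. (a) $\mathbf{v}_1 = (3, 6)$
(b) $\mathbf{v}_2 = (-4, -8)$
(c) $\mathbf{v}_3 = (-4, -3)$
(d) $\mathbf{v}_4 = (3, 4, 5)$
(e) $\mathbf{v}_5 = (3, 3, 0)$
(f) $\mathbf{v}_6 = (-1, 0, 2)$

Answer:

[Figures for parts (a)-(f): sketched vectors, not reproduced here.]

4. (a) $\mathbf{v}_1 = (5, -4)$
(b) $\mathbf{v}_2 = (3, 0)$
(c) $\mathbf{v}_3 = (0, -7)$
(d) $\mathbf{v}_4 = (0, 0, -3)$
(e) $\mathbf{v}_5 = (0, 4, -1)$
(f) $\mathbf{v}_6 = (2, 2, 2)$
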
In Exercises 5—6, sketch the following vectors with the initial points located at the origin. 


5. (a) $P_1(4, 8)$, $P_2(3, 7)$
(b) $P_1(3, -5)$, $P_2(-4, -7)$
(c) $P_1(3, -7, 2)$, $P_2(-2, 5, -4)$

Answer:

[Figures for parts (a)-(c): sketched vectors, not reproduced here.]


6. (а) P1( 5,0), P2( —3, 1) 
(b) $P_1(0, 0)$, $P_2(3, 4)$
(c) $P_1(-1, 0, 2)$, $P_2(0, -1, 0)$
(d) $P_1(2, 2, 2)$, $P_2(0, 0, 0)$


In Exercises 7-8, find the components of the vector $\overrightarrow{P_1P_2}$.

7. (a) $P_1(3, 5)$, $P_2(2, 8)$
(b) $P_1(5, -2, 1)$, $P_2(2, 4, 2)$

Answer:

(a) $\overrightarrow{P_1P_2} = (-1, 3)$
(b) $\overrightarrow{P_1P_2} = (-3, 6, 1)$

8. (a) $P_1(-6, 2)$, $P_2(-4, -1)$
(b) $P_1(0, 0, 0)$, $P_2(-1, 6, 1)$


9. (a) Find the terminal point of the vector that is equivalent to $\mathbf{u} = (1, 2)$ and whose initial point is $A(1, 1)$.
(b) Find the initial point of the vector that is equivalent to $\mathbf{u} = (1, 1, 3)$ and whose terminal point is
$B(-1, -1, 2)$.

Answer:

(a) The terminal point is $B(2, 3)$.
(b) The initial point is $A(-2, -2, -1)$.

10. (a) Find the initial point of the vector that is equivalent to $\mathbf{u} = (1, 2)$ and whose terminal point is $B(2, 0)$.
(b) Find the terminal point of the vector that is equivalent to $\mathbf{u} = (1, 1, 3)$ and whose initial point is
$A(0, 2, 0)$.


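11. Find a nonzero vector $\mathbf{u}$ with terminal point $Q(3, 0, -5)$ such that
(a) $\mathbf{u}$ has the same direction as $\mathbf{v} = (4, -2, -1)$.
(b) $\mathbf{u}$ is oppositely directed to $\mathbf{v} = (4, -2, -1)$.

Answer:
(a) $\mathbf{u} = \mathbf{v} = (4, -2, -1)$ with initial point $(-1, 2, -4)$ is one possible answer.
(b) $\mathbf{u} = -\mathbf{v} = (-4, 2, 1)$ with initial point $(7, -2, -6)$ is one possible answer.

12. Find a nonzero vector $\mathbf{u}$ with initial point $P(-1, 3, -5)$ such that
(a) $\mathbf{u}$ has the same direction as $\mathbf{v} = (6, 7, -3)$.
(b) $\mathbf{u}$ is oppositely directed to $\mathbf{v} = (6, 7, -3)$.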


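13. Let $\mathbf{u} = (4, -1)$, $\mathbf{v} = (0, 5)$, and $\mathbf{w} = (-3, -3)$. Find the components of
(a) $\mathbf{u} + \mathbf{w}$
(b) $\mathbf{v} - 3\mathbf{u}$
(c) $2(\mathbf{u} - 5\mathbf{w})$
(d) $3\mathbf{v} - 2(\mathbf{u} + 2\mathbf{w})$
(e) $-3(\mathbf{w} - 2\mathbf{u} + \mathbf{v})$
(f) $(-2\mathbf{u} - \mathbf{v}) - 5(\mathbf{v} + 3\mathbf{w})$

Answer:

(a) $\mathbf{u} + \mathbf{w} = (1, -4)$
(b) $\mathbf{v} - 3\mathbf{u} = (-12, 8)$
(c) $2(\mathbf{u} - 5\mathbf{w}) = (38, 28)$
(d) $3\mathbf{v} - 2(\mathbf{u} + 2\mathbf{w}) = (4, 29)$
(e) $-3(\mathbf{w} - 2\mathbf{u} + \mathbf{v}) = (33, -12)$
(f) $(-2\mathbf{u} - \mathbf{v}) - 5(\mathbf{v} + 3\mathbf{w}) = (37, 17)$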


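14. Let $\mathbf{u} = (-3, 1, 2)$, $\mathbf{v} = (4, 0, -8)$, and $\mathbf{w} = (6, -1, -4)$. Find the components of
(a) $\mathbf{v} - \mathbf{w}$
(b) $6\mathbf{u} + 2\mathbf{v}$
(c) $-\mathbf{v} + \mathbf{u}$
(d) $5(\mathbf{v} - 4\mathbf{u})$
(e) $-3(\mathbf{v} - 8\mathbf{w})$
(f) $(2\mathbf{u} - 7\mathbf{w}) - (8\mathbf{v} + \mathbf{u})$

15. Let $\mathbf{u} = (-3, 2, 1, 0)$, $\mathbf{v} = (4, 7, -3, 2)$, and $\mathbf{w} = (5, -2, 8, 1)$. Find the components of
(a) $\mathbf{v} - \mathbf{w}$
(b) $2\mathbf{u} + 7\mathbf{v}$
(c) $-\mathbf{u} + (\mathbf{v} - 4\mathbf{w})$
(d) $6(\mathbf{u} - 3\mathbf{v})$
(e) $-\mathbf{v} - \mathbf{w}$
(f) $(6\mathbf{v} - \mathbf{w}) - (4\mathbf{u} + \mathbf{v})$

Answer:

(a) $(-1, 9, -11, 1)$
(b) $(22, 53, -19, 14)$
(c) $(-13, 13, -36, -2)$
(d) $(-90, -114, 60, -36)$
(e) $(-9, -5, -5, -3)$
(f) $(27, 29, -27, 9)$

16. Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be the vectors in Exercise 15. Find the vector $\mathbf{x}$ that satisfies $5\mathbf{x} - 2\mathbf{v} = 2(\mathbf{w} - 5\mathbf{x})$.

17. Let $\mathbf{u} = (5, -1, 0, 3, -3)$, $\mathbf{v} = (-1, -1, 7, 2, 0)$, and $\mathbf{w} = (-4, 2, -3, -5, 2)$. Find the
components of
(a) $\mathbf{w} - \mathbf{u}$
(b) $2\mathbf{v} + 3\mathbf{u}$
(c) $-\mathbf{w} + 3(\mathbf{v} - \mathbf{u})$
(d) $5(-\mathbf{v} + 4\mathbf{u} - \mathbf{w})$
(e) $-2(3\mathbf{w} + \mathbf{v}) + (2\mathbf{u} + \mathbf{w})$
(f) $\frac{1}{2}(\mathbf{w} - 5\mathbf{v} + 2\mathbf{u}) + \mathbf{v}$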


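Answer:

(a) $\mathbf{w} - \mathbf{u} = (-9, 3, -3, -8, 5)$
(b) $2\mathbf{v} + 3\mathbf{u} = (13, -5, 14, 13, -9)$
(c) $-\mathbf{w} + 3(\mathbf{v} - \mathbf{u}) = (-14, -2, 24, 2, 7)$
(d) $5(-\mathbf{v} + 4\mathbf{u} - \mathbf{w}) = (125, -25, -20, 75, -70)$
(e) $-2(3\mathbf{w} + \mathbf{v}) + (2\mathbf{u} + \mathbf{w}) = (32, -10, 1, 27, -16)$
(f) $\frac{1}{2}(\mathbf{w} - 5\mathbf{v} + 2\mathbf{u}) + \mathbf{v} = \left(\frac{9}{2}, \frac{3}{2}, -12, -\frac{5}{2}, -2\right)$

18. Let $\mathbf{u} = (1, 2, -3, 5, 0)$, $\mathbf{v} = (0, 4, -1, 1, 2)$, and $\mathbf{w} = (7, 1, -4, -2, 3)$. Find the components of
(a) $\mathbf{v} + \mathbf{w}$
(b) $3(2\mathbf{u} - \mathbf{v})$
(c) $(3\mathbf{u} - \mathbf{v}) - (2\mathbf{u} + 4\mathbf{w})$

19. Let $\mathbf{u} = (-3, 1, 2, 4, 4)$, $\mathbf{v} = (4, 0, -8, 1, 2)$, and $\mathbf{w} = (6, -1, -4, 3, -5)$. Find the components
of
(a) $\mathbf{v} - \mathbf{w}$
(b) $6\mathbf{u} + 2\mathbf{v}$
(c) $(2\mathbf{u} - 7\mathbf{w}) - (8\mathbf{v} + \mathbf{u})$

Answer:

(a) $\mathbf{v} - \mathbf{w} = (-2, 1, -4, -2, 7)$
(b) $6\mathbf{u} + 2\mathbf{v} = (-10, 6, -4, 26, 28)$
(c) $(2\mathbf{u} - 7\mathbf{w}) - (8\mathbf{v} + \mathbf{u}) = (-77, 8, 94, -25, 23)$

20. Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be the vectors in Exercise 18. Find the components of the vector $\mathbf{x}$ that satisfies the
equation $3\mathbf{u} + \mathbf{v} - 2\mathbf{w} = 3\mathbf{x} + 2\mathbf{w}$.

21. Let $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ be the vectors in Exercise 19. Find the components of the vector $\mathbf{x}$ that satisfies the
equation $2\mathbf{u} - \mathbf{v} + \mathbf{x} = 7\mathbf{x} + \mathbf{w}$.

Answer:

$\mathbf{x} = \frac{1}{6}(2\mathbf{u} - \mathbf{v} - \mathbf{w}) = \left(-\frac{8}{3}, \frac{1}{2}, \frac{8}{3}, \frac{2}{3}, \frac{11}{6}\right)$
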
22. For what value(s) of $t$, if any, is the given vector parallel to $\mathbf{u} = (4, -1)$?
(a) $(8t, -2)$
(b) $(8t, 2t)$
(c) $(1, t^2)$

23. Which of the following vectors in $R^6$ are parallel to $\mathbf{u} = (-2, 1, 0, 3, 5, 1)$?
(a) $(4, 2, 0, 6, 10, 2)$
(b) $(4, -2, 0, -6, -10, -2)$
(c) $(0, 0, 0, 0, 0, 0)$

Answer:

(a) Not parallel
(b) Parallel
(c) Parallel

24. Let $\mathbf{u} = (2, 1, 0, 1, -1)$ and $\mathbf{v} = (-2, 3, 1, 0, 2)$. Find scalars $a$ and $b$ so that
$a\mathbf{u} + b\mathbf{v} = (-8, 8, 3, -1, 7)$.

25. Let $\mathbf{u} = (1, -1, 3, 5)$ and $\mathbf{v} = (2, 1, 0, -3)$. Find scalars $a$ and $b$ so that $a\mathbf{u} + b\mathbf{v} = (1, -4, 9, 18)$.

Answer:

$a = 3$, $b = -1$

26. Find all scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(1, 2, 0) + c_2(2, 1, 1) + c_3(0, 3, 1) = (0, 0, 0)$

27. Find all scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(1, -1, 0) + c_2(3, 2, 1) + c_3(0, 1, 4) = (-1, 1, 19)$

Answer:

$c_1 = 2$, $c_2 = -1$, $c_3 = 5$

28. Find all scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(-1, 0, 2) + c_2(2, 2, -2) + c_3(1, -2, 1) = (-6, 12, 4)$

29. Let $\mathbf{u}_1 = (-1, 3, 2, 0)$, $\mathbf{u}_2 = (2, 0, 4, -1)$, $\mathbf{u}_3 = (7, 1, 1, 4)$, and $\mathbf{u}_4 = (6, 3, 1, 2)$. Find scalars $c_1$,
$c_2$, $c_3$, and $c_4$ such that $c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + c_3\mathbf{u}_3 + c_4\mathbf{u}_4 = (0, 5, 6, -3)$.

Answer:

$c_1 = 1$, $c_2 = 1$, $c_3 = -1$, $c_4 = 1$

30. Show that there do not exist scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(1, 0, 1, 0) + c_2(1, 0, -2, 1) + c_3(2, 0, 1, 2) = (1, -2, 2, 3)$

31. Show that there do not exist scalars $c_1$, $c_2$, and $c_3$ such that
$c_1(-2, 9, 6) + c_2(-3, 2, 1) + c_3(1, 7, 5) = (0, 5, 4)$

32. Consider Figure 3.1.12. Discuss a geometric interpretation of the vector
$\mathbf{u} = \overrightarrow{OP_1} + \frac{1}{2}\left(\overrightarrow{OP_2} - \overrightarrow{OP_1}\right)$

33. Let $P$ be the point $(2, 3, -2)$ and $Q$ the point $(7, -4, 1)$.
(a) Find the midpoint of the line segment connecting $P$ and $Q$.
(b) Find the point on the line segment connecting $P$ and $Q$ that is $\frac{3}{4}$ of the way from $P$ to $Q$.

Answer:

(a) $\left(\frac{9}{2}, -\frac{1}{2}, -\frac{1}{2}\right)$
(b) $\left(\frac{23}{4}, -\frac{9}{4}, \frac{1}{4}\right)$


34. Let P be the point (1, 5, 7). If the point (4, 0, — 6) is the midpoint of the line segment connecting Р and 
Q, what is Q? 


35. Prove parts (a), (c), and (d) of Theorem 3.1.1. 
36. Prove parts (e)-(h) of Theorem 3.1.1.
37. Prove parts (a)-(c) of Theorem 3.1.2. 


True-False Exercises 
In parts (a)-(k) determine whether the statement is true or false, and justify your answer.
(a) Two equivalent vectors must have the same initial point. 

Answer: 


False 


(b) The vectors $(a, b)$ and $(a, b, 0)$ are equivalent.
Answer: 


False 


(с) If k is a scalar and v is a vector, then v and kv are parallel if and only if k > 0. 
Answer: 


False 


(d) The vectors v + (u + w) and (w + v) + u are the same. 
Answer: 


True 


(e) If $\mathbf{u} + \mathbf{v} = \mathbf{u} + \mathbf{w}$, then $\mathbf{v} = \mathbf{w}$.
Answer: 


True 


(f) If $a$ and $b$ are scalars such that $a\mathbf{u} + b\mathbf{v} = \mathbf{0}$, then $\mathbf{u}$ and $\mathbf{v}$ are parallel vectors.
Answer: 


False 


(g) Collinear vectors with the same length are equal. 
Answer: 


False 


(h) If $(a, b, c) + (x, y, z) = (x, y, z)$, then $(a, b, c)$ must be the zero vector.


Answer: 


True 


(i) If $k$ and $m$ are scalars and $\mathbf{u}$ and $\mathbf{v}$ are vectors, then

$(k + m)(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + m\mathbf{v}$


Answer: 


False 


(j) If the vectors $\mathbf{v}$ and $\mathbf{w}$ are given, then the vector equation

$3(2\mathbf{v} - \mathbf{x}) = 5\mathbf{x} - 4\mathbf{w} + \mathbf{v}$

can be solved for $\mathbf{x}$.
Answer: 


True 
(k) The linear combinations $a_1\mathbf{v}_1 + a_2\mathbf{v}_2$ and $b_1\mathbf{v}_1 + b_2\mathbf{v}_2$ can only be equal if $a_1 = b_1$ and $a_2 = b_2$.


Answer: 


False 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


3.2 Norm, Dot Product, and Distance in $R^n$

In this section we will be concerned with the notions of length and distance as they relate to vectors. We will
first discuss these ideas in $R^2$ and $R^3$ and then extend them algebraically to $R^n$.


Norm of a Vector 


In this text we will denote the length of a vector $\mathbf{v}$ by the symbol $\|\mathbf{v}\|$, which is read as the norm of $\mathbf{v}$, the
length of $\mathbf{v}$, or the magnitude of $\mathbf{v}$ (the term "norm" being a common mathematical synonym for length). As
suggested in Figure 3.2.1a, it follows from the Theorem of Pythagoras that the norm of a vector $(v_1, v_2)$ in $R^2$
is

$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2}$ (1)

Similarly, for a vector $(v_1, v_2, v_3)$ in $R^3$, it follows from Figure 3.2.1b and two applications of the Theorem of
Pythagoras that

$\|\mathbf{v}\|^2 = (OR)^2 + (RP)^2 = (OQ)^2 + (QR)^2 + (RP)^2 = v_1^2 + v_2^2 + v_3^2$

and hence that

$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}$ (2)


Motivated by the pattern of Formulas 1 and 2 we make the following definition. 


DEFINITION 1

If $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ is a vector in $R^n$, then the norm of $\mathbf{v}$ (also called the length of $\mathbf{v}$ or the
magnitude of $\mathbf{v}$) is denoted by $\|\mathbf{v}\|$, and is defined by the formula

$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2 + \cdots + v_n^2}$ (3)


EXAMPLE 1 Calculating Norms

It follows from Formula 2 that the norm of the vector $\mathbf{v} = (-3, 2, 1)$ in $R^3$ is

$\|\mathbf{v}\| = \sqrt{(-3)^2 + 2^2 + 1^2} = \sqrt{14}$

and it follows from Formula 3 that the norm of the vector $\mathbf{v} = (2, -1, 3, -5)$ in $R^4$ is

$\|\mathbf{v}\| = \sqrt{2^2 + (-1)^2 + 3^2 + (-5)^2} = \sqrt{39}$

Figure 3.2.1


Our first theorem in this section will generalize to $R^n$ the following three familiar facts about vectors in $R^2$ and
$R^3$:

* Distances are nonnegative.
* The zero vector is the only vector of length zero.
* Multiplying a vector by a scalar multiplies its length by the absolute value of that scalar.

It is important to recognize that just because these results hold in $R^2$ and $R^3$ does not guarantee that they hold
in $R^n$; their validity in $R^n$ must be proved using algebraic properties of n-tuples.


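THEOREM 3.2.1

If $\mathbf{v}$ is a vector in $R^n$, and if $k$ is any scalar, then:

(a) $\|\mathbf{v}\| \ge 0$
(b) $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$
(c) $\|k\mathbf{v}\| = |k| \, \|\mathbf{v}\|$

We will prove part (c) and leave (a) and (b) as exercises.

Proof (c) If $\mathbf{v} = (v_1, v_2, \ldots, v_n)$, then $k\mathbf{v} = (kv_1, kv_2, \ldots, kv_n)$, so

$\|k\mathbf{v}\| = \sqrt{(kv_1)^2 + (kv_2)^2 + \cdots + (kv_n)^2}$
$= \sqrt{k^2 (v_1^2 + v_2^2 + \cdots + v_n^2)}$
$= |k| \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$
$= |k| \, \|\mathbf{v}\|$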


Unit Vectors 


A vector of norm 1 is called a unit vector. Such vectors are useful for specifying a direction when length is not
relevant to the problem at hand. You can obtain a unit vector in a desired direction by choosing any nonzero
vector $\mathbf{v}$ in that direction and multiplying $\mathbf{v}$ by the reciprocal of its length. For example, if $\mathbf{v}$ is a vector of
length 2 in $R^2$ or $R^3$, then $\frac{1}{2}\mathbf{v}$ is a unit vector in the same direction as $\mathbf{v}$. More generally, if $\mathbf{v}$ is any nonzero
vector in $R^n$, then

$\mathbf{u} = \frac{1}{\|\mathbf{v}\|}\mathbf{v}$ (4)

defines a unit vector that is in the same direction as $\mathbf{v}$. We can confirm that 4 is a unit vector by applying part
(c) of Theorem 3.2.1 with $k = 1/\|\mathbf{v}\|$ to obtain

$\|\mathbf{u}\| = \|k\mathbf{v}\| = |k| \, \|\mathbf{v}\| = k\|\mathbf{v}\| = \frac{1}{\|\mathbf{v}\|}\|\mathbf{v}\| = 1$

The process of multiplying a nonzero vector by the reciprocal of its length to obtain a unit vector is called
normalizing $\mathbf{v}$.


WARNING

Sometimes you will see Formula 4 expressed as

$\mathbf{u} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$

This is just a more compact way of writing that
formula and is not intended to convey that $\mathbf{v}$ is
being divided by $\|\mathbf{v}\|$.


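EXAMPLE 2 Normalizing a Vector

Find the unit vector $\mathbf{u}$ that has the same direction as $\mathbf{v} = (2, 2, -1)$.

Solution The vector $\mathbf{v}$ has length

$\|\mathbf{v}\| = \sqrt{2^2 + 2^2 + (-1)^2} = 3$

Thus, from 4

$\mathbf{u} = \frac{1}{3}(2, 2, -1) = \left(\frac{2}{3}, \frac{2}{3}, -\frac{1}{3}\right)$

As a check, you may want to confirm that $\|\mathbf{u}\| = 1$.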
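
The check suggested in Example 2 is easy to carry out numerically. The sketch below assumes vectors are stored as tuples of numbers; `norm` and `normalize` are illustrative names that implement Formulas 3 and 4 with the standard library only.

```python
import math


def norm(v):
    """Euclidean norm of v: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Unit vector in the direction of v; v must be nonzero."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(x / length for x in v)


v = (2, 2, -1)
u = normalize(v)
print(norm(v))  # length of v
print(u)        # components of the unit vector
print(norm(u))  # should be 1 up to rounding error
```

The explicit test for a zero length reflects the requirement in the text that only nonzero vectors can be normalized.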


The Standard Unit Vectors 


When a rectangular coordinate system is introduced in $R^2$ or $R^3$, the unit vectors in the positive directions of
the coordinate axes are called the standard unit vectors. In $R^2$ these vectors are denoted by

$\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$

and in $R^3$ by

$\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$

(Figure 3.2.2). Every vector $\mathbf{v} = (v_1, v_2)$ in $R^2$ and every vector $\mathbf{v} = (v_1, v_2, v_3)$ in $R^3$ can be expressed as a
linear combination of standard unit vectors by writing

$\mathbf{v} = (v_1, v_2) = v_1(1, 0) + v_2(0, 1) = v_1\mathbf{i} + v_2\mathbf{j}$ (5)

$\mathbf{v} = (v_1, v_2, v_3) = v_1(1, 0, 0) + v_2(0, 1, 0) + v_3(0, 0, 1) = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ (6)

Moreover, we can generalize these formulas to $R^n$ by defining the standard unit vectors in $R^n$ to be

$\mathbf{e}_1 = (1, 0, 0, \ldots, 0)$, $\mathbf{e}_2 = (0, 1, 0, \ldots, 0)$, $\ldots$, $\mathbf{e}_n = (0, 0, 0, \ldots, 1)$ (7)

in which case every vector $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ in $R^n$ can be expressed as

$\mathbf{v} = (v_1, v_2, \ldots, v_n) = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + \cdots + v_n\mathbf{e}_n$ (8)

EXAMPLE 3 Linear Combinations of Standard Unit Vectors

$(2, -3, 4) = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$
$(7, 3, -4, 5) = 7\mathbf{e}_1 + 3\mathbf{e}_2 - 4\mathbf{e}_3 + 5\mathbf{e}_4$

Figure 3.2.2


Distance in $R^n$

If $P_1$ and $P_2$ are points in $R^2$ or $R^3$, then the length of the vector $\overrightarrow{P_1P_2}$ is equal to the distance $d$ between the
two points (Figure 3.2.3). Specifically, if $P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ are points in $R^2$, then Formula 4 of
Section 3.1 implies that

$d = \|\overrightarrow{P_1P_2}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$ (9)

This is the familiar distance formula from analytic geometry. Similarly, the distance between the points
$P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$ in 3-space is

$d = \|\overrightarrow{P_1P_2}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$ (10)

Motivated by Formulas 9 and 10, we make the following definition.

DEFINITION 2

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are points in $R^n$, then we denote the distance between
$\mathbf{u}$ and $\mathbf{v}$ by $d(\mathbf{u}, \mathbf{v})$ and define it to be

$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$ (11)

Figure 3.2.3: $d = \|\overrightarrow{P_1P_2}\|$


We noted in the previous section that n-tuples 
can be viewed either as vectors or points in $R^n$.
In Definition 2 we chose to describe them as 
points, as that seemed the more natural 
interpretation. 


EXAMPLE 4 Calculating Distance in $R^n$

If

$\mathbf{u} = (1, 3, -2, 7)$ and $\mathbf{v} = (0, 7, 2, 2)$

then the distance between $\mathbf{u}$ and $\mathbf{v}$ is

$d(\mathbf{u}, \mathbf{v}) = \sqrt{(1 - 0)^2 + (3 - 7)^2 + (-2 - 2)^2 + (7 - 2)^2} = \sqrt{58}$
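
Formula 11 can be evaluated for points of any dimension with a short function. This is a minimal sketch, assuming points are stored as tuples of equal length; the name `distance` is an illustrative choice.

```python
import math


def distance(u, v):
    """Euclidean distance d(u, v) = ||u - v|| between two points of R^n."""
    if len(u) != len(v):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# The points of Example 4.
u = (1, 3, -2, 7)
v = (0, 7, 2, 2)
d = distance(u, v)
print(d, math.sqrt(58))  # the two printed values should agree
```

Python 3.8 and later also provide `math.dist`, which computes the same quantity.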


Dot Product 


Our next objective is to define a useful multiplication operation on vectors in $R^2$ and $R^3$ and then extend that
operation to $R^n$. To do this we will first need to define exactly what we mean by the "angle" between two
vectors in $R^2$ or $R^3$. For this purpose, let $\mathbf{u}$ and $\mathbf{v}$ be nonzero vectors in $R^2$ or $R^3$ that have been positioned so
that their initial points coincide. We define the angle between $\mathbf{u}$ and $\mathbf{v}$ to be the angle $\theta$ determined by $\mathbf{u}$ and $\mathbf{v}$
that satisfies the inequalities $0 \le \theta \le \pi$ (Figure 3.2.4).

DEFINITION 3

If $\mathbf{u}$ and $\mathbf{v}$ are nonzero vectors in $R^2$ or $R^3$, and if $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$, then the dot product
(also called the Euclidean inner product) of $\mathbf{u}$ and $\mathbf{v}$ is denoted by $\mathbf{u} \cdot \mathbf{v}$ and is defined as

$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$ (12)

If $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$, then we define $\mathbf{u} \cdot \mathbf{v}$ to be 0.

Figure 3.2.4: The angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ satisfies $0 \le \theta \le \pi$.


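The sign of the dot product reveals information about the angle $\theta$ that we can obtain by rewriting Formula 12
as

$\cos\theta = \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \, \|\mathbf{v}\|}$ (13)

Since $0 \le \theta \le \pi$, it follows from Formula 13 and properties of the cosine function studied in trigonometry that
* $\theta$ is acute if $\mathbf{u} \cdot \mathbf{v} > 0$.
* $\theta$ is obtuse if $\mathbf{u} \cdot \mathbf{v} < 0$.
* $\theta = \pi/2$ if $\mathbf{u} \cdot \mathbf{v} = 0$.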


EXAMPLE 5 Dot Product

Find the dot product of the vectors shown in Figure 3.2.5.

Figure 3.2.5

Solution The lengths of the vectors are

$\|\mathbf{u}\| = 1$ and $\|\mathbf{v}\| = \sqrt{8} = 2\sqrt{2}$

and the cosine of the angle $\theta$ between them is

$\cos(45^\circ) = 1/\sqrt{2}$

Thus, it follows from Formula 12 that

$\mathbf{u} \cdot \mathbf{v} = \|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta = (1)(2\sqrt{2})(1/\sqrt{2}) = 2$


EXAMPLE 6 A Geometry Problem Solved Using Dot Product

Find the angle between a diagonal of a cube and one of its edges.

Solution Let $k$ be the length of an edge and introduce a coordinate system as shown in Figure 3.2.6.
If we let $\mathbf{u}_1 = (k, 0, 0)$, $\mathbf{u}_2 = (0, k, 0)$, and $\mathbf{u}_3 = (0, 0, k)$, then the vector

$\mathbf{d} = (k, k, k) = \mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$

is a diagonal of the cube. It follows from Formula 13 that the angle $\theta$ between $\mathbf{d}$ and the edge $\mathbf{u}_1$
satisfies

$\cos\theta = \frac{\mathbf{u}_1 \cdot \mathbf{d}}{\|\mathbf{u}_1\| \, \|\mathbf{d}\|} = \frac{k^2}{(k)(\sqrt{3k^2})} = \frac{1}{\sqrt{3}}$

With the help of a calculator we obtain

$\theta = \cos^{-1}\left(\frac{1}{\sqrt{3}}\right) \approx 54.74^\circ$

Figure 3.2.6

Note that the angle $\theta$ obtained in Example 6
does not involve $k$. Why was this to be
expected?

Component Form of the Dot Product 


For computational purposes it is desirable to have a formula that expresses the dot product of two vectors in 
terms of components. We will derive such a formula for vectors in 3-space; the derivation for vectors in 


2-space is similar. 


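Let $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$ be two nonzero vectors. If, as shown in Figure 3.2.7, $\theta$ is the angle
between $\mathbf{u}$ and $\mathbf{v}$, then the law of cosines yields

$\|\overrightarrow{PQ}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta$ (14)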

Josiah Willard Gibbs (1839-1903) 


Historical Note The dot product notation was first introduced by the American physicist and 
mathematician J. Willard Gibbs in a pamphlet distributed to his students at Yale University in the 
1880s. The product was originally written on the baseline, rather than centered as today, and was 
referred to as the direct product. Gibbs's pamphlet was eventually incorporated into a book entitled 
Vector Analysis that was published in 1901 and coauthored with one of his students. Gibbs made major 
contributions to the fields of thermodynamics and electromagnetic theory and is generally regarded as 
the greatest American physicist of the nineteenth century. 

[Image: The Granger Collection, New York]


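Since $\overrightarrow{PQ} = \mathbf{v} - \mathbf{u}$, we can rewrite 14 as

$\|\mathbf{u}\| \, \|\mathbf{v}\| \cos\theta = \frac{1}{2}\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - \|\mathbf{v} - \mathbf{u}\|^2\right)$

or

$\mathbf{u} \cdot \mathbf{v} = \frac{1}{2}\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - \|\mathbf{v} - \mathbf{u}\|^2\right)$

Substituting

$\|\mathbf{u}\|^2 = u_1^2 + u_2^2 + u_3^2$, $\quad \|\mathbf{v}\|^2 = v_1^2 + v_2^2 + v_3^2$

and

$\|\mathbf{v} - \mathbf{u}\|^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2 + (v_3 - u_3)^2$

we obtain, after simplifying,

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3$ (15)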


Although we derived Formula 15 and its
2-space companion under the assumption that $\mathbf{u}$
and $\mathbf{v}$ are nonzero, it turns out that these
formulas are also applicable if $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$
(verify).

The companion formula for vectors in 2-space is

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2$ (16)


Motivated by the pattern in Formulas 15 and 16, we make the following definition. 


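DEFINITION 4

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then the dot product (also called the
Euclidean inner product) of $\mathbf{u}$ and $\mathbf{v}$ is denoted by $\mathbf{u} \cdot \mathbf{v}$ and is defined by

$\mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n$ (17)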


In words, to calculate the dot product 
(Euclidean inner product) multiply 
corresponding components and add the 
resulting products. 


EXAMPLE 7 Calculating Dot Products Using Components

(a) Use Formula 15 to compute the dot product of the vectors $\mathbf{u}$ and $\mathbf{v}$ in Example 5.
(b) Calculate $\mathbf{u} \cdot \mathbf{v}$ for the following vectors in $R^4$:

$\mathbf{u} = (-1, 3, 5, 7)$, $\quad \mathbf{v} = (-3, -4, 1, 0)$

Solution
(a) The component forms of the vectors are $\mathbf{u} = (0, 0, 1)$ and $\mathbf{v} = (0, 2, 2)$. Thus,

$\mathbf{u} \cdot \mathbf{v} = (0)(0) + (0)(2) + (1)(2) = 2$

which agrees with the result obtained geometrically in Example 5.

(b) $\mathbf{u} \cdot \mathbf{v} = (-1)(-3) + (3)(-4) + (5)(1) + (7)(0) = -4$

Figure 3.2.7
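
Formula 17 amounts to a single loop over corresponding components. The sketch below assumes vectors are stored as tuples; the name `dot` is illustrative, and the sample vectors are those of Example 7.

```python
def dot(u, v):
    """Dot product: multiply corresponding components and add the results."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


# Part (a): the vectors of Example 5 in component form.
print(dot((0, 0, 1), (0, 2, 2)))

# Part (b): two vectors in R^4.
print(dot((-1, 3, 5, 7), (-3, -4, 1, 0)))
```

The length check is worth keeping because `zip` would otherwise silently ignore the extra components of the longer vector.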


Algebraic Properties of the Dot Product

In the special case where $\mathbf{u} = \mathbf{v}$ in Definition 4, we obtain the relationship

$\mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + \cdots + v_n^2 = \|\mathbf{v}\|^2$ (18)

This yields the following formula for expressing the length of a vector in terms of a dot product:

$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$ (19)


Dot products have many of the same algebraic properties as products of real numbers. 


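THEOREM 3.2.2

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $R^n$, and if $k$ is a scalar, then:

(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$ [Symmetry property]
(b) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$ [Distributive property]
(c) $k(\mathbf{u} \cdot \mathbf{v}) = (k\mathbf{u}) \cdot \mathbf{v}$ [Homogeneity property]
(d) $\mathbf{v} \cdot \mathbf{v} \ge 0$ and $\mathbf{v} \cdot \mathbf{v} = 0$ if and only if $\mathbf{v} = \mathbf{0}$ [Positivity property]

We will prove parts (c) and (d) and leave the other proofs as exercises.

Proof (c) Let $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$. Then

$k(\mathbf{u} \cdot \mathbf{v}) = k(u_1v_1 + u_2v_2 + \cdots + u_nv_n)$
$= (ku_1)v_1 + (ku_2)v_2 + \cdots + (ku_n)v_n = (k\mathbf{u}) \cdot \mathbf{v}$

Proof (d) The result follows from parts (a) and (b) of Theorem 3.2.1 and the fact that

$\mathbf{v} \cdot \mathbf{v} = v_1v_1 + v_2v_2 + \cdots + v_nv_n = v_1^2 + v_2^2 + \cdots + v_n^2 = \|\mathbf{v}\|^2$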


The next theorem gives additional properties of dot products. The proofs can be obtained either by expressing 
the vectors in terms of components or by using the algebraic properties established in Theorem 3.2.2. 


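THEOREM 3.2.3

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $R^n$, and if $k$ is a scalar, then:
(a) $\mathbf{0} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{0} = 0$
(b) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$
(c) $\mathbf{u} \cdot (\mathbf{v} - \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} - \mathbf{u} \cdot \mathbf{w}$
(d) $(\mathbf{u} - \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} - \mathbf{v} \cdot \mathbf{w}$
(e) $k(\mathbf{u} \cdot \mathbf{v}) = \mathbf{u} \cdot (k\mathbf{v})$

We will show how Theorem 3.2.2 can be used to prove part (b) without breaking the vectors into components.
The other proofs are left as exercises.

Proof (b)
$(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{w} \cdot (\mathbf{u} + \mathbf{v})$ [By symmetry]
$= \mathbf{w} \cdot \mathbf{u} + \mathbf{w} \cdot \mathbf{v}$ [By distributivity]
$= \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$ [By symmetry]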


Formulas 18 and 19 together with Theorems 3.2.2 and 3.2.3 make it possible to manipulate expressions 
involving dot products using familiar algebraic techniques. 


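EXAMPLE 8 Calculating with Dot Products

$(\mathbf{u} - 2\mathbf{v}) \cdot (3\mathbf{u} + 4\mathbf{v}) = \mathbf{u} \cdot (3\mathbf{u} + 4\mathbf{v}) - 2\mathbf{v} \cdot (3\mathbf{u} + 4\mathbf{v})$
$= 3(\mathbf{u} \cdot \mathbf{u}) + 4(\mathbf{u} \cdot \mathbf{v}) - 6(\mathbf{v} \cdot \mathbf{u}) - 8(\mathbf{v} \cdot \mathbf{v})$
$= 3\|\mathbf{u}\|^2 - 2(\mathbf{u} \cdot \mathbf{v}) - 8\|\mathbf{v}\|^2$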


Cauchy-Schwarz Inequality and Angles in $R^n$

Our next objective is to extend to $R^n$ the notion of "angle" between nonzero vectors $\mathbf{u}$ and $\mathbf{v}$. We will do this
by starting with the formula

$\theta = \cos^{-1}\left(\frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \, \|\mathbf{v}\|}\right)$ (20)

which we previously derived for nonzero vectors in $R^2$ and $R^3$. Since dot products and norms have been
defined for vectors in $R^n$, it would seem that this formula has all the ingredients to serve as a definition of the
angle $\theta$ between two vectors, $\mathbf{u}$ and $\mathbf{v}$, in $R^n$. However, there is a fly in the ointment, the problem being that the
inverse cosine in Formula 20 is not defined unless its argument satisfies the inequalities

$-1 \le \frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \, \|\mathbf{v}\|} \le 1$ (21)

Fortunately, these inequalities do hold for all nonzero vectors in $R^n$ as a result of the following fundamental
result known as the Cauchy-Schwarz inequality.

THEOREM 3.2.4 Cauchy-Schwarz Inequality

If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then

$|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\| \, \|\mathbf{v}\|$ (22)

or in terms of components

$|u_1v_1 + u_2v_2 + \cdots + u_nv_n| \le (u_1^2 + u_2^2 + \cdots + u_n^2)^{1/2} (v_1^2 + v_2^2 + \cdots + v_n^2)^{1/2}$ (23)

We will omit the proof of this theorem because later in the text we will prove a more general version of which
this will be a special case. Our goal for now will be to use this theorem to prove that the inequalities in 21 hold
for all nonzero vectors in $R^n$. Once that is done we will have established all the results required to use Formula
20 as our definition of the angle between nonzero vectors $\mathbf{u}$ and $\mathbf{v}$ in $R^n$.

To prove that the inequalities in 21 hold for all nonzero vectors in $R^n$, divide both sides of Formula 22 by the
product $\|\mathbf{u}\| \, \|\mathbf{v}\|$ to obtain

$\frac{|\mathbf{u} \cdot \mathbf{v}|}{\|\mathbf{u}\| \, \|\mathbf{v}\|} \le 1$ or equivalently $\left|\frac{\mathbf{u} \cdot \mathbf{v}}{\|\mathbf{u}\| \, \|\mathbf{v}\|}\right| \le 1$

from which 21 follows.
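
A numerical experiment is no substitute for a proof, but it can make Formulas 20 through 22 concrete. The sketch below spot-checks the Cauchy-Schwarz inequality on randomly generated vectors in $R^5$ and then uses Formula 20 to compute an angle in $R^4$. The function names, the fixed random seed, and the small tolerance `tol` that absorbs floating-point rounding are all assumptions made for this illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    return math.sqrt(dot(v, v))


def cauchy_schwarz_holds(u, v, tol=1e-12):
    """Check |u . v| <= ||u|| ||v||, allowing for rounding error."""
    return abs(dot(u, v)) <= norm(u) * norm(v) + tol


def angle(u, v):
    """Angle between nonzero vectors u and v in R^n (Formula 20)."""
    ratio = dot(u, v) / (norm(u) * norm(v))
    # Inequality 21 holds exactly; the clamp only absorbs rounding error.
    return math.acos(max(-1.0, min(1.0, ratio)))


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [
    ([rng.uniform(-10, 10) for _ in range(5)],
     [rng.uniform(-10, 10) for _ in range(5)])
    for _ in range(1000)
]
all_hold = all(cauchy_schwarz_holds(u, v) for u, v in samples)
print(all_hold)

# Two standard unit vectors of R^4 are perpendicular.
print(angle((1, 0, 0, 0), (0, 1, 0, 0)))
```

Equality in Formula 22 occurs when one vector is a scalar multiple of the other, which is the case where rounding is most likely to push the ratio marginally past 1 and the clamp matters.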


Hermann Amandus Schwarz (1843-1921)

Viktor Yakovlevich Bunyakovsky (1804-1889)


Historical Note The Cauchy—Schwarz inequality is named in honor of the French mathematician 
Augustin Cauchy (see p. 109) and the German mathematician Hermann Schwarz. Variations of this 
inequality occur in many different settings and under various names. Depending on the context in 
which the inequality occurs, you may find it called Cauchy's inequality, the Schwarz inequality, or 
sometimes even the Bunyakovsky inequality, in recognition of the Russian mathematician who 
published his version of the inequality in 1859, about 25 years before Schwarz. 

[Images: wikipedia (Schwarz); wikipedia (Bunyakovsky)]


Geometry in $R^n$

Earlier in this section we extended various concepts to $R^n$ with the idea that familiar results that we can
visualize in $R^2$ and $R^3$ might be valid in $R^n$ as well. Here are two fundamental theorems from plane geometry
whose validity extends to $R^n$:
* The sum of the lengths of two sides of a triangle is at least as large as the third (Figure 3.2.8).
* The shortest distance between two points is a straight line (Figure 3.2.9).

The following theorem generalizes these theorems to $R^n$.


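THEOREM 3.2.5

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $R^n$, and if $k$ is any scalar, then:
(a) $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ [Triangle inequality for vectors]
(b) $d(\mathbf{u}, \mathbf{v}) \le d(\mathbf{u}, \mathbf{w}) + d(\mathbf{w}, \mathbf{v})$ [Triangle inequality for distances]

Proof (a)

$\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) = (\mathbf{u} \cdot \mathbf{u}) + 2(\mathbf{u} \cdot \mathbf{v}) + (\mathbf{v} \cdot \mathbf{v})$
$= \|\mathbf{u}\|^2 + 2(\mathbf{u} \cdot \mathbf{v}) + \|\mathbf{v}\|^2$
$\le \|\mathbf{u}\|^2 + 2|\mathbf{u} \cdot \mathbf{v}| + \|\mathbf{v}\|^2$ [Property of absolute value]
$\le \|\mathbf{u}\|^2 + 2\|\mathbf{u}\| \, \|\mathbf{v}\| + \|\mathbf{v}\|^2$ [Cauchy-Schwarz inequality]
$= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2$

Since both sides of the inequality in part (a) are nonnegative, taking square roots completes the proof.

Proof (b) It follows from part (a) and Formula 11 that

$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \|(\mathbf{u} - \mathbf{w}) + (\mathbf{w} - \mathbf{v})\|$
$\le \|\mathbf{u} - \mathbf{w}\| + \|\mathbf{w} - \mathbf{v}\| = d(\mathbf{u}, \mathbf{w}) + d(\mathbf{w}, \mathbf{v})$

Figure 3.2.8: $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$

Figure 3.2.9: $d(\mathbf{u}, \mathbf{v}) \le d(\mathbf{u}, \mathbf{w}) + d(\mathbf{w}, \mathbf{v})$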


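It is proved in plane geometry that for any parallelogram the sum of the squares of the diagonals is equal to the
sum of the squares of the four sides (Figure 3.2.10). The following theorem generalizes that result to $R^n$.

THEOREM 3.2.6 Parallelogram Equation for Vectors

If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $R^n$, then

$\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2\right)$ (24)

Proof

$\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v}) \cdot (\mathbf{u} + \mathbf{v}) + (\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v})$
$= 2(\mathbf{u} \cdot \mathbf{u}) + 2(\mathbf{v} \cdot \mathbf{v})$
$= 2\left(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2\right)$

Figure 3.2.10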


We could state and prove many more theorems from plane geometry that generalize to $R^n$, but the ones already
given should suffice to convince you that $R^n$ is not so different from $R^2$ and $R^3$ even though we cannot
visualize it directly. The next theorem establishes a fundamental relationship between the dot product and norm
in $R^n$.


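THEOREM 3.2.7

If u and v are vectors in $R^n$ with the Euclidean inner product, then

$$u \cdot v = \tfrac{1}{4}\|u + v\|^2 - \tfrac{1}{4}\|u - v\|^2 \tag{25}$$

Proof

$$
\begin{aligned}
\|u + v\|^2 &= (u + v) \cdot (u + v) = \|u\|^2 + 2(u \cdot v) + \|v\|^2 \\
\|u - v\|^2 &= (u - v) \cdot (u - v) = \|u\|^2 - 2(u \cdot v) + \|v\|^2
\end{aligned}
$$

from which 25 follows by simple algebra.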


Note that Formula 25 expresses the dot product 
in terms of norms. 
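
The three results above (the triangle inequality, the parallelogram equation, and Formula 25) can be spot-checked numerically. The following is only a sketch: the helper names `dot`, `norm`, `add`, and `sub` and the sample vectors are illustrative choices, not part of the text, and a numerical check on one pair of vectors is of course not a proof.

```python
import math

def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u||."""
    return math.sqrt(dot(u, u))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

# Arbitrary sample vectors in R^4 (illustrative values only).
u = [1.0, -2.0, 3.0, 0.5]
v = [4.0, 0.0, -1.0, 2.0]

# Theorem 3.2.5(a): triangle inequality for vectors.
triangle_holds = norm(add(u, v)) <= norm(u) + norm(v)

# Theorem 3.2.6: both sides of the parallelogram equation, Formula (24).
lhs_24 = norm(add(u, v)) ** 2 + norm(sub(u, v)) ** 2
rhs_24 = 2 * (norm(u) ** 2 + norm(v) ** 2)

# Theorem 3.2.7: the dot product recovered from norms, Formula (25).
rhs_25 = 0.25 * norm(add(u, v)) ** 2 - 0.25 * norm(sub(u, v)) ** 2

print(triangle_holds)
print(math.isclose(lhs_24, rhs_24))
print(math.isclose(dot(u, v), rhs_25))
```

Comparisons use `math.isclose` rather than `==` because the square roots introduce floating-point rounding.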


Dot Products as Matrix Multiplication 


There are various ways to express the dot product of vectors using matrix notation. The formulas depend on 
whether the vectors are expressed as row matrices or column matrices. Here are the possibilities. 


If A is an $n \times n$ matrix and u and v are $n \times 1$ matrices, then it follows from the first row in Table 1 and
properties of the transpose that

$$
\begin{aligned}
Au \cdot v &= v^T(Au) = (v^T A)u = (A^T v)^T u = u \cdot A^T v \\
u \cdot Av &= (Av)^T u = (v^T A^T)u = v^T(A^T u) = A^T u \cdot v
\end{aligned}
$$

The resulting formulas

$$Au \cdot v = u \cdot A^T v \tag{26}$$

$$u \cdot Av = A^T u \cdot v \tag{27}$$

provide an important link between multiplication by an $n \times n$ matrix A and multiplication by $A^T$.


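EXAMPLE 9 Verifying That $Au \cdot v = u \cdot A^T v$

Suppose that

$$A = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & 1 \\ -1 & 0 & 1 \end{bmatrix}, \quad u = \begin{bmatrix} -1 \\ 2 \\ 4 \end{bmatrix}, \quad v = \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix}$$

Then

$$Au = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & 1 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \\ 5 \end{bmatrix}$$

$$A^T v = \begin{bmatrix} 1 & 2 & -1 \\ -2 & 4 & 0 \\ 3 & 1 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} -7 \\ 4 \\ -1 \end{bmatrix}$$

from which we obtain

$$Au \cdot v = 7(-2) + 10(0) + 5(5) = 11$$

$$u \cdot A^T v = (-1)(-7) + 2(4) + 4(-1) = 11$$

Thus, $Au \cdot v = u \cdot A^T v$ as guaranteed by Formula 26. We leave it for you to verify that Formula
27 also holds.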
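
As a sketch of the verification left to the reader, the following checks Formulas 26 and 27 with the data of Example 9. The helper names `dot`, `matvec`, and `transpose` are illustrative. The matrix is entered with third column (3, 1, 1), which is the column consistent with the products Au = (7, 10, 5) and A^T v = (-7, 4, -1) stated in the example; treat that entry as an assumption if your copy of the example differs.

```python
def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Product of a matrix (list of rows) with a column vector."""
    return [dot(row, x) for row in A]

def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

# Data of Example 9 (third column of A assumed as described above).
A = [[1, -2, 3],
     [2, 4, 1],
     [-1, 0, 1]]
u = [-1, 2, 4]
v = [-2, 0, 5]

Au = matvec(A, u)
ATv = matvec(transpose(A), v)

# Formula (26): Au . v = u . A^T v
lhs_26 = dot(Au, v)
rhs_26 = dot(u, ATv)

# Formula (27): u . Av = A^T u . v
lhs_27 = dot(u, matvec(A, v))
rhs_27 = dot(matvec(transpose(A), u), v)

print(Au, ATv)
print(lhs_26, rhs_26)
print(lhs_27, rhs_27)
```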


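Table 1

Form: u a column matrix and v a column matrix
Dot Product: $u \cdot v = u^T v = v^T u$
Example: $u = \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix}$, $v = \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix}$; $u^T v = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix} \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix} = -7$, $v^T u = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix} = -7$

Form: u a row matrix and v a column matrix
Dot Product: $u \cdot v = uv = v^T u^T$
Example: $u = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix}$, $v = \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix}$; $uv = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix} \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix} = -7$, $v^T u^T = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix} = -7$

Form: u a column matrix and v a row matrix
Dot Product: $u \cdot v = vu = u^T v^T$
Example: $u = \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix}$, $v = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix}$; $vu = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix} = -7$, $u^T v^T = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix} \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix} = -7$

Form: u a row matrix and v a row matrix
Dot Product: $u \cdot v = uv^T = vu^T$
Example: $u = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix}$, $v = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix}$; $uv^T = \begin{bmatrix} 1 & -3 & 5 \end{bmatrix} \begin{bmatrix} 5 \\ 4 \\ 0 \end{bmatrix} = -7$, $vu^T = \begin{bmatrix} 5 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 5 \end{bmatrix} = -7$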

A Dot Product View of Matrix Multiplication 


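Dot products provide another way of thinking about matrix multiplication. Recall that if $A = [a_{ij}]$ is an $m \times r$
matrix and $B = [b_{ij}]$ is an $r \times n$ matrix, then the $ij$th entry of $AB$ is

$$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ir}b_{rj}$$

which is the dot product of the $i$th row vector of $A$

$$\begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{ir} \end{bmatrix}$$

and the $j$th column vector of $B$

$$\begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{rj} \end{bmatrix}$$

Thus, if the row vectors of $A$ are $r_1, r_2, \ldots, r_m$ and the column vectors of $B$ are $c_1, c_2, \ldots, c_n$, then the matrix
product $AB$ can be expressed as

$$AB = \begin{bmatrix} r_1 \cdot c_1 & r_1 \cdot c_2 & \cdots & r_1 \cdot c_n \\ r_2 \cdot c_1 & r_2 \cdot c_2 & \cdots & r_2 \cdot c_n \\ \vdots & \vdots & & \vdots \\ r_m \cdot c_1 & r_m \cdot c_2 & \cdots & r_m \cdot c_n \end{bmatrix} \tag{28}$$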
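
The row-by-column description of AB translates almost word for word into code. The sketch below is illustrative only: the function name `matmul_by_dots` and the sample matrices are assumptions made for the example, and no attempt is made at efficiency.

```python
def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matmul_by_dots(A, B):
    """Form AB entry by entry: entry (i, j) is (row i of A) . (column j of B)."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    columns_of_B = list(zip(*B))  # column vectors c_1, ..., c_n
    return [[dot(row, col) for col in columns_of_B] for row in A]

# Sample 2 x 3 and 3 x 4 matrices (illustrative values only).
A = [[1, 2, 4],
     [2, 6, 0]]
B = [[4, 1, 4, 3],
     [0, -1, 3, 1],
     [2, 7, 5, 2]]

AB = matmul_by_dots(A, B)
print(AB)
```

Here the 2 x 3 matrix has two row vectors and the 3 x 4 matrix has four column vectors, so the product has 2 x 4 = 8 entries, each a single dot product.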


Application of Dot Products to ISBN Numbers 


Although the system has recently changed, most books published in the last 25 years have been 
assigned a unique 10-digit number called an International Standard Book Number or ISBN. The first 
nine digits of this number are split into three groups—the first group representing the country or group 
of countries in which the book originates, the second identifying the publisher, and the third assigned to 
the book title itself. The tenth and final digit, called a check digit, is computed from the first nine digits 
and is used to ensure that an electronic transmission of the ISBN, say over the Internet, occurs without 
error. 


To explain how this is done, regard the first nine digits of the ISBN as a vector b in $R^9$, and let a be the
vector
a = (1, 2, 3, 4, 5, 6, 7, 8, 9)
Then the check digit c is computed using the following procedure:
1. Form the dot product a · b.
2. Divide a · b by 11, thereby producing a remainder c that is an integer between 0 and 10, inclusive.
The check digit is taken to be c, with the proviso that c = 10 is written as X to avoid double digits.
For example, the ISBN of the brief edition of Calculus, sixth edition, by Howard Anton is
0-471-15307-9
which has a check digit of 9. This is consistent with the first nine digits of the ISBN, since
a · b = (1, 2, 3, 4, 5, 6, 7, 8, 9) · (0, 4, 7, 1, 1, 5, 3, 0, 7) = 152
Dividing 152 by 11 produces a quotient of 13 and a remainder of 9, so the check digit is c = 9. If an
electronic order is placed for a book with a certain ISBN, then the warehouse can use the above 


procedure to verify that the check digit is consistent with the first nine digits, thereby reducing the 
possibility of a costly shipping error. 
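
The two-step procedure above can be sketched in a few lines. The function name `isbn10_check_digit` is an illustrative choice, and the code assumes the first nine digits are supplied as a list of integers.

```python
def isbn10_check_digit(first_nine):
    """Return the ISBN-10 check digit for the given first nine digits.

    Step 1: form the dot product a . b with a = (1, 2, ..., 9).
    Step 2: take the remainder on division by 11; a remainder of 10 is written 'X'.
    """
    if len(first_nine) != 9:
        raise ValueError("expected exactly nine digits")
    a = range(1, 10)
    c = sum(weight * digit for weight, digit in zip(a, first_nine)) % 11
    return "X" if c == 10 else str(c)

# First nine digits of the ISBN quoted in the text, 0-471-15307-9.
b = [0, 4, 7, 1, 1, 5, 3, 0, 7]
print(isbn10_check_digit(b))
```

A warehouse-style consistency check would compare the returned character with the tenth character of the ISBN that was received.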


Concept Review 


* Norm (or length or magnitude) of a vector
* Unit vector
* Normalized vector
* Standard unit vectors
* Distance between points in $R^n$
* Angle between two vectors in $R^n$
* Dot product (or Euclidean inner product) of two vectors in $R^n$
* Cauchy-Schwarz inequality
* Triangle inequality
* Parallelogram equation for vectors

Skills 

* Compute the norm of a vector in $R^n$.
* Determine whether a given vector in $R^n$ is a unit vector.
* Normalize a nonzero vector in $R^n$.
* Determine the distance between two vectors in $R^n$.
* Compute the dot product of two vectors in $R^n$.
* Compute the angle between two nonzero vectors in $R^n$.
* Prove basic properties pertaining to norms and dot products (Theorems 3.2.1-3.2.3 and 3.2.5-3.2.7).


Exercise Set 3.2 


In Exercises 1-2, find the norm of v, a unit vector that has the same direction as v, and a unit vector that is 
oppositely directed to v. 


1. (a) v = (4, −3)
(b) у= (a, ese) 
(c) v = (1, 0, 2, 1, 3)


Answer: 
(a) $\|v\| = 5$, $\dfrac{v}{\|v\|} = \left(\tfrac{4}{5}, -\tfrac{3}{5}\right)$, $-\dfrac{v}{\|v\|} = \left(-\tfrac{4}{5}, \tfrac{3}{5}\right)$
= Talos ge шш шш шеш = 
menus de s) е y s 
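(c) $\|v\| = \sqrt{15}$, $\dfrac{v}{\|v\|} = \dfrac{1}{\sqrt{15}}(1, 0, 2, 1, 3)$, $-\dfrac{v}{\|v\|} = -\dfrac{1}{\sqrt{15}}(1, 0, 2, 1, 3)$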


2. (a) v = (−5, 12)
(b) v = (1, −1, 2)
(c) v = (−2, 3, 3, −1)
In Exercises 3-4, evaluate the given expression with u = (2, −2, 3), v = (1, −3, 4), and
w = (3, 6, −4).
3. (a) $\|u + v\|$
(b) $\|u\| + \|v\|$
(c) $\|-2u + 2v\|$
(d) $\|3u - 5v + w\|$


Answer: 


(a) $\|u + v\| = \sqrt{83}$
(b) $\|u\| + \|v\| = \sqrt{17} + \sqrt{26}$
(c) $\|-2u + 2v\| = 2\sqrt{3}$
(d) $\|3u - 5v + w\| = \sqrt{466}$

4. (a) $\|u + v + w\|$
(b) $\|u - v\|$
(c) $\|3v\| - 3\|v\|$
(d) $\|u\| - \|v\|$


In Exercises 5-6, evaluate the given expression with u = (−2, −1, 4, 5), v = (3, 1, −5, 7), and
w = (−6, 2, 1, 1).
5. (a) $\|3u - 5v + w\|$
(b) $\|3u\| - 5\|v\| + \|w\|$
(c) $\|-\|u\| v\|$

Answer:

(a) $\|3u - 5v + w\| = \sqrt{2570}$
(b) $\|3u\| - 5\|v\| + \|w\| = 3\sqrt{46} - 10\sqrt{21} + \sqrt{42}$
(c) $\|-\|u\| v\| = 2\sqrt{966}$

6. (a) $\|u\| - 2\|v\| - 3\|w\|$
(b) $\|u\| + \|-2v\| + \|-5w\|$
(c) $\|\|u - v\| w\|$


7. Let v = (−2, 3, 0, 6). Find all scalars k such that $\|kv\| = 5$.

Answer:

$k = \tfrac{5}{7}$, $k = -\tfrac{5}{7}$

8. Let v = (1, 1, 2, −3, 1). Find all scalars k such that $\|kv\| = 4$.


In Exercises 9-10, find u · v, u · u, and v · v.

9. (a) u = (3, 1, 4), v = (2, 2, −4)
(b) u = (1, 1, 4, 6), v = (2, −2, 3, −2)

Answer:

(a) u · v = −8, u · u = 26, v · v = 24
(b) u · v = 0, u · u = 54, v · v = 21

10. (a) u = (1, 1, −2, 2), v = (−1, 0, 5, 1)
(b) u = (2, −1, 1, 0, −2), v = (1, 2, 2, 2, 1)


In Exercises 11—12, find the Euclidean distance between u and v. 


11. (a) u = (3, 3, 3), v = (1, 0, 4)
(b) u = (0, −2, −1, 1), v = (−3, 2, 4, 4)
(c) u = (3, −3, −2, 0, −3, 13, 5),
v = (−4, 1, −1, 5, 0, −11, 4)

Answer:

(a) $\|u - v\| = \sqrt{14}$
(b) $\|u - v\| = \sqrt{59}$
(c) $\|u - v\| = \sqrt{677}$

12. (a) u = (1, 2, −3, 0), v = (5, 1, 2, −2)
(b) u = (2, −1, −4, 1, 0, 6, −3, 1),
v = (−2, −1, 0, 3, 7, 2, −5, 1)
(c) u = (0, 1, 1, 1, 2), v = (2, 1, 0, −1, 3)


13. Find the cosine of the angle between the vectors in each part of Exercise 11, and then state whether the
angle is acute, obtuse, or 90°.

Answer:

(a) $\cos\theta = \dfrac{15}{\sqrt{27}\sqrt{17}}$; θ is acute
(b) $\cos\theta = -\dfrac{4}{\sqrt{6}\sqrt{45}}$; θ is obtuse
(c) $\cos\theta = -\dfrac{136}{\sqrt{225}\sqrt{180}}$; θ is obtuse


14. Find the cosine of the angle between the vectors in each part of Exercise 12, and then state whether the 
angle is acute, obtuse, or 90°. 


15. Suppose that a vector a in the xy-plane has a length of 9 units and points in a direction that is 120°
counterclockwise from the positive x-axis, and a vector b in that plane has a length of 5 units and points in
the positive y-direction. Find a · b.

Answer:

$a \cdot b = \dfrac{45\sqrt{3}}{2}$

16. Suppose that a vector a in the xy-plane points in a direction that is 47° counterclockwise from the positive
x-axis, and a vector b in that plane points in a direction that is 43° clockwise from the positive x-axis. What
can you say about the value of a · b?


In Exercises 17-18, determine whether the expression makes sense mathematically. If not, explain why. 


17. (a) $u \cdot (v \cdot w)$
(b) $u \cdot (v + w)$
(c) $\|u \cdot v\|$
(d) $(u \cdot v) - \|u\|$

Answer:

(a) $u \cdot (v \cdot w)$ does not make sense because $v \cdot w$ is a scalar.
(b) $u \cdot (v + w)$ makes sense.
(c) $\|u \cdot v\|$ does not make sense because the quantity inside the norm is a scalar.
(d) $(u \cdot v) - \|u\|$ makes sense since the terms are both scalars.

18. (a) $\|u\| \cdot \|v\|$
(b) $(u \cdot v) - w$
(c) $(u \cdot v) - k$
(d) $k \cdot u$

19. Find a unit vector that has the same direction as the given vector.
(a) (3-9
(b) (1, 7)
(c) $(-3, 2, \sqrt{3})$
(d) (1, 2, 3, 4, 5)

Answer:

20. Find a unit vector that is oppositely directed to the given vector.
(a) (−12, −5)
(b) (3, −3, −3)
(c) (−6, 8)
(d) $(-3, 1, \sqrt{6}, 3)$

21. State a procedure for finding a vector of a specified length m that points in the same direction as a given
vector v.

Answer:

$\dfrac{m}{\|v\|} v$

22. If $\|v\| = 2$ and $\|w\| = 3$, what are the largest and smallest values possible for $\|v - w\|$? Give a geometric
explanation of your results.

23. Find the cosine of the angle θ between u and v.
(a) u = (2, 3), v = (5, −7)
(b) u = (−6, −2), v = (4, 0)
(c) u = (1, −5, 4), v = (3, 3, 3)
(d) u = (−2, 2, 3), v = (1, 7, −4)

Answer:

(a) $\cos\theta = -\dfrac{11}{\sqrt{962}}$
(b) $\cos\theta = -\dfrac{3}{\sqrt{10}}$
(c) $\cos\theta = 0$
(d) $\cos\theta = 0$

24. Find the radian measure of the angle θ (with $0 \le \theta \le \pi$) between u and v.
(a) (1, −7) and (21, 3)
(b) (0, 2) and (3, −3)
(c) (−1, 1, 0) and (0, −1, 1)
(d) (1, −1, 0) and (1, 0, 0)

In Exercises 25-26, verify that the Cauchy-Schwarz inequality holds.
25. (a) u = (3, 2), v = (4, −1)
(b) u = (−3, 1, 0), v = (2, −1, 3)
(c) u = (0, 2, 2, 1), v = (1, 1, 1, 1)

Answer:

(a) $|u \cdot v| = 10$, $\|u\|\|v\| = \sqrt{13}\sqrt{17} \approx 14.866$
(b) $|u \cdot v| = 7$, $\|u\|\|v\| = \sqrt{10}\sqrt{14} \approx 11.832$
(c) $|u \cdot v| = 5$, $\|u\|\|v\| = (3)(2) = 6$

26. (a) u = (4, 1, 1), v = (1, 2, 3)
(b) u = (1, 2, 1, 2, 3), v = (0, 1, 1, 5, −2)
(c) u = (1, 3, 5, 2, 0, 1), v = (0, 2, 4, 1, 3, 5)


27. Let $p_0 = (x_0, y_0, z_0)$ and $p = (x, y, z)$. Describe the set of all points (x, y, z) for which $\|p - p_0\| = 1$.
Answer:

A sphere of radius 1 centered at $(x_0, y_0, z_0)$.

28. (a) Show that the components of the vector $v = (v_1, v_2)$ in Figure Ex-28a are $v_1 = \|v\|\cos\theta$ and
$v_2 = \|v\|\sin\theta$.
(b) Let u and v be the vectors in Figure Ex-28b. Use the result in part (a) to find the components of
4u − 5v.

Figure Ex-28


29. Prove parts (a) and (b) of Theorem 3.2.1. 
30. Prove parts (a) and (c) of Theorem 3.2.3. 
31. Prove parts (d) and (e) of Theorem 3.2.3. 


32. Under what conditions will the triangle inequality (Theorem 3.2.5a) be an equality? Explain your answer 
geometrically. 


33. What can you say about two nonzero vectors, u and v, that satisfy the equation $\|u + v\| = \|u\| + \|v\|$?


34. (a) What relationship must hold for the point p = (a, b, c) to be equidistant from the origin and the
xz-plane? Make sure that the relationship you state is valid for positive and negative values of a, b, and
c.

(b) What relationship must hold for the point p = (a, b, c) to be farther from the origin than from the
xz-plane? Make sure that the relationship you state is valid for positive and negative values of a, b, and
c.


True-False Exercises
In parts (a)-(j) determine whether the statement is true or false, and justify your answer.
(a) If each component of a vector in $R^3$ is doubled, the norm of that vector is doubled.
Answer:

True

(b) In $R^2$, the vectors of norm 5 whose initial points are at the origin have terminal points lying on a circle of
radius 5 centered at the origin.
Answer:

True

(c) Every vector in $R^n$ has a positive norm.
Answer:

False

(d) If v is a nonzero vector in $R^n$, there are exactly two unit vectors that are parallel to v.
Answer:

True

(e) If $\|u\| = 2$, $\|v\| = 1$, and $u \cdot v = 1$, then the angle between u and v is $\pi/3$ radians.
Answer:

True

(f) The expressions $(u \cdot v) + w$ and $u \cdot (v + w)$ are both meaningful and equal to each other.
Answer:

False

(g) If $u \cdot v = u \cdot w$, then v = w.
Answer:

False

(h) If $u \cdot v = 0$, then either u = 0 or v = 0.
Answer:

False

(i) In $R^2$, if u lies in the first quadrant and v lies in the third quadrant, then $u \cdot v$ cannot be positive.
Answer:

True

(j) For all vectors u, v, and w in $R^n$, we have
$\|u + v + w\| \le \|u\| + \|v\| + \|w\|$
Answer:

True
3.3 Orthogonality


In the last section we defined the notion of “angle” between vectors in $R^n$. In this section we will focus on the notion of
“perpendicularity.” Perpendicular vectors in $R^n$ play an important role in a wide variety of applications.


Orthogonal Vectors 


Recall from Formula 20 in the previous section that the angle θ between two nonzero vectors u and v in $R^n$ is defined by the
formula

$$\theta = \cos^{-1}\left(\frac{u \cdot v}{\|u\|\|v\|}\right)$$

It follows from this that $\theta = \pi/2$ if and only if $u \cdot v = 0$. Thus, we make the following definition.


DEFINITION 1 


Two nonzero vectors u and v in $R^n$ are said to be orthogonal (or perpendicular) if $u \cdot v = 0$. We will also agree that the
zero vector in $R^n$ is orthogonal to every vector in $R^n$. A nonempty set of vectors in $R^n$ is called an orthogonal set if all
pairs of distinct vectors in the set are orthogonal. An orthogonal set of unit vectors is called an orthonormal set.


EXAMPLE 1 Orthogonal Vectors

(a) Show that u = (−2, 3, 1, 4) and v = (1, 2, 0, −1) are orthogonal vectors in $R^4$.

(b) Show that the set S = {i, j, k} of standard unit vectors is an orthogonal set in $R^3$.

Solution
(a) The vectors are orthogonal since
u · v = (−2)(1) + (3)(2) + (1)(0) + (4)(−1) = 0
(b) We must show that all pairs of distinct vectors are orthogonal, that is,
i · j = i · k = j · k = 0
This is evident geometrically (Figure 3.2.2), but it can be seen as well from the computations
i · j = (1, 0, 0) · (0, 1, 0) = 0
i · k = (1, 0, 0) · (0, 0, 1) = 0
j · k = (0, 1, 0) · (0, 0, 1) = 0

In Example 1 there is no need to check that
j · i = k · i = k · j = 0
since this follows from computations in the example and
the symmetry property of the dot product.
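
Definition 1 can be turned into a short check. In the sketch below the function names `is_orthogonal_set` and `is_orthonormal_set` and the tolerance parameter `tol` are illustrative assumptions. Because the dot product is symmetric, each unordered pair of distinct vectors is tested only once, which mirrors the remark above.

```python
from itertools import combinations

def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal_set(vectors, tol=1e-12):
    """True if every pair of distinct vectors in the set has dot product 0 (within tol)."""
    return all(abs(dot(u, v)) <= tol for u, v in combinations(vectors, 2))

def is_orthonormal_set(vectors, tol=1e-12):
    """True if the set is orthogonal and every vector has norm 1 (within tol)."""
    return is_orthogonal_set(vectors, tol) and all(
        abs(dot(v, v) - 1) <= tol for v in vectors
    )

# Part (a) of Example 1.
print(is_orthogonal_set([(-2, 3, 1, 4), (1, 2, 0, -1)]))

# Part (b) of Example 1: the standard unit vectors i, j, k.
standard_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(is_orthogonal_set(standard_basis), is_orthonormal_set(standard_basis))
```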


Lines and Planes Determined by Points and Normals 


One learns in analytic geometry that a line in $R^2$ is determined uniquely by its slope and one of its points, and that a plane in $R^3$ is
determined uniquely by its “inclination” and one of its points. One way of specifying slope and inclination is to use a nonzero
vector n, called a normal, that is orthogonal to the line or plane in question. For example, Figure 3.3.1 shows the line through the
point $P_0(x_0, y_0)$ that has normal n = (a, b) and the plane through the point $P_0(x_0, y_0, z_0)$ that has normal n = (a, b, c). Both
the line and the plane are represented by the vector equation

$$n \cdot \overrightarrow{P_0P} = 0 \tag{1}$$

where P is either an arbitrary point (x, y) on the line or an arbitrary point (x, y, z) in the plane. The vector $\overrightarrow{P_0P}$ can be expressed
in terms of components as

$$\overrightarrow{P_0P} = (x - x_0,\ y - y_0) \quad \text{[line]}$$

$$\overrightarrow{P_0P} = (x - x_0,\ y - y_0,\ z - z_0) \quad \text{[plane]}$$

Thus, Formula 1 can be written as

$$a(x - x_0) + b(y - y_0) = 0 \quad \text{[line]} \tag{2}$$

$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \quad \text{[plane]} \tag{3}$$

These are called the point-normal equations of the line and plane.


EXAMPLE 2 Point-Normal Equations

It follows from 2 that in $R^2$ the equation
6(x − 3) + (y + 7) = 0
represents the line through the point (3, −7) with normal n = (6, 1); and it follows from 3 that in $R^3$ the equation
4(x − 3) + 2y − 5(z − 7) = 0
represents the plane through the point (3, 0, 7) with normal n = (4, 2, −5).


Figure 3.3.1


When convenient, the terms in Equations 2 and 3 can be multiplied out and the constants combined. This leads to the following 
theorem. 


THEOREM 3.3.1 


(a) If a and b are constants that are not both zero, then an equation of the form

$$ax + by + c = 0 \tag{4}$$

represents a line in $R^2$ with normal n = (a, b).

(b) If a, b, and c are constants that are not all zero, then an equation of the form

$$ax + by + cz + d = 0 \tag{5}$$

represents a plane in $R^3$ with normal n = (a, b, c).
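
Multiplying out a point-normal equation, as described before the theorem, only requires computing one constant term. The sketch below is illustrative: the function name `general_form` is an assumption, and the returned tuple lists the normal's components followed by the constant term (the c of Formula 4 or the d of Formula 5).

```python
def general_form(normal, point):
    """Expand n . (x - x0) = 0 into n . x + constant = 0.

    Returns the components of the normal followed by the constant term,
    which equals -(n . x0).
    """
    constant = -sum(n * p for n, p in zip(normal, point))
    return (*normal, constant)

# The line of Example 2: 6(x - 3) + (y + 7) = 0.
print(general_form((6, 1), (3, -7)))

# The plane of Example 2: 4(x - 3) + 2y - 5(z - 7) = 0.
print(general_form((4, 2, -5), (3, 0, 7)))
```

Reading the outputs as coefficient lists gives 6x + y − 11 = 0 for the line and 4x + 2y − 5z + 23 = 0 for the plane.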


EXAMPLE 3 Vectors Orthogonal to Lines and Planes Through the Origin

(a) The equation ax + by = 0 represents a line through the origin in $R^2$. Show that the vector $n_1 = (a, b)$ formed
from the coefficients of the equation is orthogonal to the line, that is, orthogonal to every vector along the line.
(b) The equation ax + by + cz = 0 represents a plane through the origin in $R^3$. Show that the vector $n_2 = (a, b, c)$
formed from the coefficients of the equation is orthogonal to the plane, that is, orthogonal to every vector that
lies in the plane.

Solution We will solve both problems together. The two equations can be written as
(a, b) · (x, y) = 0 and (a, b, c) · (x, y, z) = 0
or, alternatively, as
$n_1 \cdot (x, y) = 0$ and $n_2 \cdot (x, y, z) = 0$
These equations show that $n_1$ is orthogonal to every vector (x, y) on the line and that $n_2$ is orthogonal to every
vector (x, y, z) in the plane (Figure 3.3.1).


Recall that
ax + by = 0 and ax + by + cz = 0
are called homogeneous equations. Example 3 illustrates that homogeneous equations in two or three unknowns can be written in
the vector form

$$n \cdot x = 0 \tag{6}$$

where n is the vector of coefficients and x is the vector of unknowns. In $R^2$ this is called the vector form of a line through the
origin, and in $R^3$ it is called the vector form of a plane through the origin.

Referring to Table 1 of Section 3.2, in what other ways
can you write 6 if n and x are expressed in matrix form?


Orthogonal Projections 


In many applications it is necessary to “decompose” a vector u into a sum of two terms, one term being a scalar multiple of a
specified nonzero vector a and the other term being orthogonal to a. For example, if u and a are vectors in $R^2$ that are positioned
so their initial points coincide at a point Q, then we can create such a decomposition as follows (Figure 3.3.2):
* Drop a perpendicular from the tip of u to the line through a.
* Construct the vector $w_1$ from Q to the foot of the perpendicular.
* Construct the vector $w_2 = u - w_1$.

Figure 3.3.2 In parts (b) through (d), $u = w_1 + w_2$, where $w_1$ is parallel to a and $w_2$ is orthogonal to a.

Since

$$w_1 + w_2 = w_1 + (u - w_1) = u$$

we have decomposed u into a sum of two orthogonal vectors, the first term being a scalar multiple of a and the second being
orthogonal to a.

The following theorem shows that the foregoing results, which we illustrated using vectors in $R^2$, apply as well in $R^n$.


THEOREM 3.3.2 Projection Theorem 


If u and a are vectors in $R^n$, and if $a \ne 0$, then u can be expressed in exactly one way in the form $u = w_1 + w_2$, where
$w_1$ is a scalar multiple of a and $w_2$ is orthogonal to a.

Proof Since the vector $w_1$ is to be a scalar multiple of a, it must have the form

$$w_1 = ka \tag{7}$$

Our goal is to find a value of the scalar k and a vector $w_2$ that is orthogonal to a such that

$$u = w_1 + w_2 \tag{8}$$

We can determine k by using 7 to rewrite 8 as

$$u = w_1 + w_2 = ka + w_2$$

and then applying Theorems 3.2.2 and 3.2.3 to obtain

$$u \cdot a = (ka + w_2) \cdot a = k\|a\|^2 + (w_2 \cdot a) \tag{9}$$

Since $w_2$ is to be orthogonal to a, the last term in 9 must be 0, and hence k must satisfy the equation

$$u \cdot a = k\|a\|^2$$

from which we obtain

$$k = \frac{u \cdot a}{\|a\|^2}$$

as the only possible value for k. The proof can be completed by rewriting 8 as

$$w_2 = u - w_1 = u - ka = u - \frac{u \cdot a}{\|a\|^2}\, a$$

and then confirming that $w_2$ is orthogonal to a by showing that $w_2 \cdot a = 0$ (we leave the details for you).
The vectors $w_1$ and $w_2$ in the Projection Theorem have associated names: the vector $w_1$ is called the orthogonal projection of u
on a or sometimes the vector component of u along a, and the vector $w_2$ is called the vector component of u orthogonal to a. The
vector $w_1$ is commonly denoted by the symbol $\operatorname{proj}_a u$, in which case it follows from 8 that $w_2 = u - \operatorname{proj}_a u$. In summary,

$$\operatorname{proj}_a u = \frac{u \cdot a}{\|a\|^2}\, a \quad \text{(vector component of u along a)} \tag{10}$$

$$u - \operatorname{proj}_a u = u - \frac{u \cdot a}{\|a\|^2}\, a \quad \text{(vector component of u orthogonal to a)} \tag{11}$$
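
Formulas 10 and 11 can be sketched as a single function that returns both components. The function name `decompose` is an illustrative choice, and exact rational arithmetic (`fractions.Fraction`) is used here only so that the orthogonality check comes out exactly zero; it assumes integer or rational input.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def decompose(u, a):
    """Return (w1, w2) with u = w1 + w2, w1 = proj_a(u), and w2 orthogonal to a."""
    a_dot_a = dot(a, a)
    if a_dot_a == 0:
        raise ValueError("a must be a nonzero vector")
    k = Fraction(dot(u, a), a_dot_a)           # k = (u . a) / ||a||^2
    w1 = [k * ai for ai in a]                  # Formula (10)
    w2 = [ui - wi for ui, wi in zip(u, w1)]    # Formula (11)
    return w1, w2

# Sample vectors in R^3 (these happen to be the ones used in Example 5).
u = [2, -1, 3]
a = [4, -1, 2]
w1, w2 = decompose(u, a)
print(w1)
print(w2)
print(dot(w2, a))  # orthogonality check
```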


EXAMPLE 4 Orthogonal Projection on a Line

Find the orthogonal projections of the vectors $e_1 = (1, 0)$ and $e_2 = (0, 1)$ on the line L that makes an angle θ with
the positive x-axis in $R^2$.

Solution As illustrated in Figure 3.3.3, $a = (\cos\theta, \sin\theta)$ is a unit vector along the line L, so our first problem is
to find the orthogonal projection of $e_1$ along a. Since

$$\|a\| = \sqrt{\sin^2\theta + \cos^2\theta} = 1 \quad \text{and} \quad e_1 \cdot a = (1, 0) \cdot (\cos\theta, \sin\theta) = \cos\theta$$

it follows from Formula 10 that this projection is

$$\operatorname{proj}_a e_1 = \frac{e_1 \cdot a}{\|a\|^2}\, a = (\cos\theta)(\cos\theta, \sin\theta) = (\cos^2\theta,\ \sin\theta\cos\theta)$$

Similarly, since $e_2 \cdot a = (0, 1) \cdot (\cos\theta, \sin\theta) = \sin\theta$, it follows from Formula 10 that

$$\operatorname{proj}_a e_2 = \frac{e_2 \cdot a}{\|a\|^2}\, a = (\sin\theta)(\cos\theta, \sin\theta) = (\sin\theta\cos\theta,\ \sin^2\theta)$$
а 


EXAMPLE 5 Vector Component of u Along a

Let u = (2, −1, 3) and a = (4, −1, 2). Find the vector component of u along a and the vector component of u
orthogonal to a.

Solution

$$u \cdot a = (2)(4) + (-1)(-1) + (3)(2) = 15$$

$$\|a\|^2 = 4^2 + (-1)^2 + 2^2 = 21$$

Thus the vector component of u along a is

$$\operatorname{proj}_a u = \frac{u \cdot a}{\|a\|^2}\, a = \tfrac{15}{21}(4, -1, 2) = \left(\tfrac{20}{7}, -\tfrac{5}{7}, \tfrac{10}{7}\right)$$

and the vector component of u orthogonal to a is

$$u - \operatorname{proj}_a u = (2, -1, 3) - \left(\tfrac{20}{7}, -\tfrac{5}{7}, \tfrac{10}{7}\right) = \left(-\tfrac{6}{7}, -\tfrac{2}{7}, \tfrac{11}{7}\right)$$

As a check, you may wish to verify that the vectors $u - \operatorname{proj}_a u$ and a are perpendicular by showing that their dot
product is zero.


Figure 3.3.3 


Sometimes we will be more interested in the norm of the vector component of u along a than in the vector component itself. A
formula for this norm can be derived as follows:

$$\|\operatorname{proj}_a u\| = \left\|\frac{u \cdot a}{\|a\|^2}\, a\right\| = \left|\frac{u \cdot a}{\|a\|^2}\right| \|a\| = \frac{|u \cdot a|}{\|a\|^2}\, \|a\|$$

where the second equality follows from part (c) of Theorem 3.2.1 and the third from the fact that $\|a\|^2 > 0$. Thus,

$$\|\operatorname{proj}_a u\| = \frac{|u \cdot a|}{\|a\|} \tag{12}$$

If θ denotes the angle between u and a, then $u \cdot a = \|u\|\|a\|\cos\theta$, so 12 can also be written as

$$\|\operatorname{proj}_a u\| = \|u\|\,|\cos\theta| \tag{13}$$

(Verify.) A geometric interpretation of this result is given in Figure 3.3.4.

Figure 3.3.4 (a) $0 \le \theta < \pi/2$; (b) $\pi/2 < \theta \le \pi$


The Theorem of Pythagoras 


In Section 3.2 we found that many theorems about vectors in $R^2$ and $R^3$ also hold in $R^n$. Another example of this is the following
generalization of the Theorem of Pythagoras (Figure 3.3.5).

THEOREM 3.3.3 Theorem of Pythagoras in $R^n$

If u and v are orthogonal vectors in $R^n$ with the Euclidean inner product, then

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2 \tag{14}$$

Proof Since u and v are orthogonal, we have $u \cdot v = 0$, from which it follows that

$$\|u + v\|^2 = (u + v) \cdot (u + v) = \|u\|^2 + 2(u \cdot v) + \|v\|^2 = \|u\|^2 + \|v\|^2$$


EXAMPLE 6 Theorem of Pythagoras in $R^4$
We showed in Example 1 that the vectors
u = (−2, 3, 1, 4) and v = (1, 2, 0, −1)
are orthogonal. Verify the Theorem of Pythagoras for these vectors.
Solution We leave it for you to confirm that

u + v = (−1, 5, 1, 3)

$\|u + v\|^2 = 36$

$\|u\|^2 + \|v\|^2 = 30 + 6$

Thus, $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.

Figure 3.3.5


OPTIONAL 
Distance Problems 


We will now show how orthogonal projections can be used to solve the following three distance problems: 
Problem 1. Find the distance between a point and a line in $R^2$.
Problem 2. Find the distance between a point and a plane in $R^3$.
Problem 3. Find the distance between two parallel planes in $R^3$.

A method for solving the first two problems is provided by the next theorem. Since the proofs of the two parts are similar, we will
prove part (b) and leave part (a) as an exercise.


THEOREM 3.3.4

(a) In $R^2$ the distance D between the point $P_0(x_0, y_0)$ and the line ax + by + c = 0 is

$$D = \frac{|ax_0 + by_0 + c|}{\sqrt{a^2 + b^2}} \tag{15}$$

(b) In $R^3$ the distance D between the point $P_0(x_0, y_0, z_0)$ and the plane ax + by + cz + d = 0 is

$$D = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \tag{16}$$

Proof (b) Let $Q(x_1, y_1, z_1)$ be any point in the plane. Position the normal n = (a, b, c) so that its initial point is at Q. As
illustrated in Figure 3.3.6, the distance D is equal to the length of the orthogonal projection of $\overrightarrow{QP_0}$ on n. Thus, it follows from
Formula 12 that

$$D = \|\operatorname{proj}_n \overrightarrow{QP_0}\| = \frac{|\overrightarrow{QP_0} \cdot n|}{\|n\|}$$

But

$$\overrightarrow{QP_0} = (x_0 - x_1,\ y_0 - y_1,\ z_0 - z_1)$$

$$\overrightarrow{QP_0} \cdot n = a(x_0 - x_1) + b(y_0 - y_1) + c(z_0 - z_1)$$

$$\|n\| = \sqrt{a^2 + b^2 + c^2}$$

Thus

$$D = \frac{|a(x_0 - x_1) + b(y_0 - y_1) + c(z_0 - z_1)|}{\sqrt{a^2 + b^2 + c^2}} \tag{17}$$

Since the point $Q(x_1, y_1, z_1)$ lies in the given plane, its coordinates satisfy the equation of that plane; thus

$$ax_1 + by_1 + cz_1 + d = 0$$

or

$$d = -ax_1 - by_1 - cz_1$$

Substituting this expression in 17 yields 16.


EXAMPLE 7 Distance Between a Point and a Plane
Find the distance D between the point (1, −4, −3) and the plane 2x − 3y + 6z = −1.

Solution Since the distance formulas in Theorem 3.3.4 require that the equations of the line and plane be written
with zero on the right side, we first need to rewrite the equation of the plane as

2x − 3y + 6z + 1 = 0

from which we obtain

$$D = \frac{|2(1) + (-3)(-4) + 6(-3) + 1|}{\sqrt{2^2 + (-3)^2 + 6^2}} = \frac{|-3|}{7} = \frac{3}{7}$$

Figure 3.3.6


The third distance problem posed above is to find the distance between two parallel planes in $R^3$. As suggested in Figure 3.3.7, the
distance between a plane V and a plane W can be obtained by finding any point $P_0$ in one of the planes, and computing the
distance between that point and the other plane. Here is an example.

Figure 3.3.7 The distance between the parallel planes V and W is equal to the distance between $P_0$ and W.


EXAMPLE 8 Distance Between Parallel Planes

The planes
x + 2y − 2z = 3 and 2x + 4y − 4z = 7
are parallel since their normals, (1, 2, −2) and (2, 4, −4), are parallel vectors. Find the distance between these
planes.

Solution To find the distance D between the planes, we can select an arbitrary point in one of the planes and
compute its distance to the other plane. By setting y = z = 0 in the equation x + 2y − 2z = 3, we obtain the point
$P_0(3, 0, 0)$ in this plane. From 16, the distance between $P_0$ and the plane 2x + 4y − 4z = 7 is

$$D = \frac{|2(3) + 4(0) + (-4)(0) - 7|}{\sqrt{2^2 + 4^2 + (-4)^2}} = \frac{1}{6}$$
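
Formulas 15 and 16 have the same shape, so one small function can cover both. The sketch below is illustrative: the function name `distance_to_hyperplane` and the convention of passing the coefficients as a tuple ending in the constant term are assumptions made for the example.

```python
import math

def distance_to_hyperplane(point, coefficients):
    """Distance from a point to the line or plane n . x + constant = 0.

    `coefficients` lists the components of the normal followed by the constant
    term: (a, b, c) for Formula (15), (a, b, c, d) for Formula (16).
    """
    *normal, constant = coefficients
    if len(normal) != len(point):
        raise ValueError("point and normal must have the same number of components")
    numerator = abs(sum(n * p for n, p in zip(normal, point)) + constant)
    return numerator / math.sqrt(sum(n * n for n in normal))

# Example 7: point (1, -4, -3) and plane 2x - 3y + 6z + 1 = 0.
print(distance_to_hyperplane((1, -4, -3), (2, -3, 6, 1)))

# Example 8: the point (3, 0, 0) of the first plane and the plane 2x + 4y - 4z - 7 = 0.
print(distance_to_hyperplane((3, 0, 0), (2, 4, -4, -7)))
```

Note that each equation must first be rewritten with zero on the right side, exactly as in Example 7, before its coefficients are passed in.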


Concept Review

* Orthogonal (perpendicular) vectors
* Orthogonal set of vectors
* Normal to a line
* Normal to a plane
* Point-normal equations
* Vector form of a line
* Vector form of a plane
* Orthogonal projection of u on a
* Vector component of u along a
* Vector component of u orthogonal to a
* Theorem of Pythagoras

Skills

* Determine whether two vectors are orthogonal.
* Determine whether a given set of vectors forms an orthogonal set.
* Find equations for lines (or planes) by using a normal vector and a point on the line (or plane).
* Find the vector form of a line or plane through the origin.
* Compute the vector component of u along a and orthogonal to a.
* Find the distance between a point and a line in $R^2$ or $R^3$.
* Find the distance between two parallel planes in $R^3$.
* Find the distance between a point and a plane.


Exercise Set 3.3 


In Exercises 1-2, determine whether u and v are orthogonal vectors.

1. (a) u = (6, 1, 4), v = (2, 0, −3)
(b) u = (0, 0, −1), v = (1, 1, 1)
(c) u = (−6, 0, 4), v = (3, 1, 6)
(d) u = (2, 4, −8), v = (5, 3, 7)


Answer: 


(a) Orthogonal 

(b) Not orthogonal 
(c) Not orthogonal 
(d) Not orthogonal 


2. (a) u = (2, 3), v = (5, −7)
(b) u = (−6, −2), v = (4, 0)
(c) u = (1, −5, 4), v = (3, 3, 3)
(d) u = (−2, 2, 3), v = (1, 7, −4)

In Exercises 3-4, determine whether the vectors form an orthogonal set.

3. (a) $v_1 = (2, 3)$, $v_2 = (3, 2)$
(b) $v_1 = (-1, 1)$, $v_2 = (1, 1)$
(c) $v_1 = (-2, 1, 1)$, $v_2 = (1, 0, 2)$, $v_3 = (-2, -5, 1)$
(d) $v_1 = (-3, 4, -1)$, $v_2 = (1, 2, 5)$, $v_3 = (4, -3, 0)$


Answer: 


(a) Not an orthogonal set 
(b) Orthogonal set 
(c) Orthogonal set 
(d) Not an orthogonal set 


4. (a) $v_1 = (2, 3)$, $v_2 = (-3, 2)$
(b) $v_1 = (1, -2)$, $v_2 = (-2, 1)$
(c) $v_1 = (1, 0, 1)$, $v_2 = (1, 1, 1)$, $v_3 = (-1, 0, 1)$
(d) $v_1 = (2, -2, 1)$, $v_2 = (2, 1, -2)$, $v_3 = (1, 2, 2)$
5. Find a unit vector that is orthogonal to both u = (1, 0, 1) and v = (0, 1, 1).


Answer: 


БЕТЕ 


6. (a) Show that v = (a, b) and w = (−b, a) are orthogonal vectors.

(b) Use the result in part (a) to find two vectors that are orthogonal to v = (2, −3).

(c) Find two unit vectors that are orthogonal to (−5, 4).

7. Do the points A(1, 1, 1), B(−2, 0, 3), and C(−3, −1, 1) form the vertices of a right triangle? Explain your answer.


Answer: 


Yes 


8. Repeat Exercise 7 for the points A(3, 0, 2), B(4, 3, 0), and C(8, 1, −1).

In Exercises 9-12, find a point-normal form of the equation of the plane passing through P and having n as a normal.

9. P(−1, 3, −2); n = (−2, 1, −1)
Answer:

−2(x + 1) + (y − 3) − (z + 2) = 0
10. P(1, 1, 4); n = (1, 9, 8)
11. P(2, 0, 0); n = (0, 0, 2)

Answer:

2z = 0
12. P(0, 0, 0); n = (1, 2, 3)


In Exercises 13—16, determine whether the given planes are parallel. 


13. 4x − y + 2z = 5 and 7x − 3y + 4z = 8

Answer:

Not parallel
14. x − 4y − 3z − 2 = 0 and 3x − 12y − 9z − 7 = 0
15. 2y = 8x = 42 + Sand x= 52+ 1 

Answer: 

Parallel 


16. (−4, 1, 2) · (x, y, z) = 0 and (8, −2, −4) · (x, y, z) = 0
In Exercises 17-18, determine whether the given planes are perpendicular.
17. 3x − y + z − 4 = 0, x + 2z = −1
Answer:
Not perpendicular
18. x − 2y + 3z = 4, −2x + 5y + 4z = −1
In Exercises 19-20, find $\|\operatorname{proj}_a u\|$.

19. (a) u = (1, −2), a = (−4, −3)
(b) u = (3, 0, 4), a = (2, 3, 3)

20. (a) u = (5, 6), a = (2, −1)
(b) u = (3, −2, 6), a = (1, 2, −7)
In Exercises 21-28, find the vector component of u along a and the vector component of u orthogonal to a.
21. u = (6, 2), a = (3, −9)
Answer: 


(0, 0) (6, 2) 
22. u = (−1, −2), a = (−2, 3)
23. u = (3, 1, −7), a = (1, 0, 5)


Answer: 


24. u = (1, 0, 0), a = (4, 3, 8)
25. u = (1, 1, 1), a = (0, 2, −1)

Answer:
$\left(0, \tfrac{2}{5}, -\tfrac{1}{5}\right)$, $\left(1, \tfrac{3}{5}, \tfrac{6}{5}\right)$
26. u = (2, 0, 1), a = (1, 2, 3)
27. u = (2, 1, 1, 2), a = (4, −4, 2, −2)

Answer:
$\left(\tfrac{1}{5}, -\tfrac{1}{5}, \tfrac{1}{10}, -\tfrac{1}{10}\right)$, $\left(\tfrac{9}{5}, \tfrac{6}{5}, \tfrac{9}{10}, \tfrac{21}{10}\right)$
28. u = (5, 0, −3, 7), a = (2, 1, −1, −1)
In Exercises 29-32, find the distance between the point and the line.
29. 4x + 3y + 4 = 0; (−3, 1)
Answer:

1
30. x − 3y + 2 = 0; (−1, 4)
31. y = −4x + 2; (2, −5)

Answer:

$\dfrac{1}{\sqrt{17}}$
32. 3x + y = 5; (1, 8)


In Exercises 33-36, find the distance between the point and the plane.

33. (3, 1, −2); x + 2y − 2z = 4
Answer:

$\dfrac{5}{3}$
34. (−1, −1, 2); 2x + 5y − 6z = 4
35. (−1, 2, 1); 2x + 3y − 4z = 1

Answer:

$\dfrac{1}{\sqrt{29}}$
36. (0, 3, = 2); x —y -z—-3 


In Exercises 37-40, find the distance between the given parallel planes.

37. 2x − y − z = 5 and −4x + 2y + 2z = 12
Answer:

$\dfrac{11}{\sqrt{6}}$
38. 3x − 4y + z = 1 and 6x − 8y + 2z = 3
39. −4x + y − 3z = 0 and 8x − 2y + 6z = 0

Answer:

0 (The planes coincide.)
40. 2x − y + z = 1 and 2x − y + z = −1

41. Let i, j, and k be unit vectors along the positive x, y, and z axes of a rectangular coordinate system in 3-space. If v = (a, b, c)
is a nonzero vector, then the angles α, β, and γ between v and the vectors i, j, and k, respectively, are called the direction
angles of v (Figure Ex-41), and the numbers cos α, cos β, and cos γ are called the direction cosines of v.

(a) Show that $\cos\alpha = a / \|v\|$.
(b) Find cos β and cos γ.
(c) Show that $v / \|v\| = (\cos\alpha, \cos\beta, \cos\gamma)$.
(d) Show that $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.

Figure Ex-41

Answer:

(b) $\cos\beta = \dfrac{b}{\|v\|}$, $\cos\gamma = \dfrac{c}{\|v\|}$

42. Use the result in Exercise 41 to estimate, to the nearest degree, the angles that a diagonal of a box with dimensions
10 cm × 15 cm × 25 cm makes with the edges of the box.
43. Show that if v is orthogonal to both $w_1$ and $w_2$, then v is orthogonal to $k_1 w_1 + k_2 w_2$ for all scalars $k_1$ and $k_2$.

44. Let u and v be nonzero vectors in 2- or 3-space, and let $k = \|u\|$ and $l = \|v\|$. Show that the vector w = lu + kv bisects the
angle between u and v.


45. Prove part (a) of Theorem 3.3.4. 


46. Is it possible to have
$\operatorname{proj}_a u = \operatorname{proj}_u a$?


Explain your reasoning. 
True-False Exercises 
In parts (a)-(g) determine whether the statement is true or false, and justify your answer. 
(a) The vectors (3, — 1, 2) and (0, 0, 0) are orthogonal. 

Answer: 


True 


(b) If u and v are orthogonal vectors, then for all nonzero scalars k and m, {ду and зуу are orthogonal vectors. 
Answer: 


True 


(c) The orthogonal projection of u along a is perpendicular to the vector component of u orthogonal to a. 
Answer: 


True 
(d) If a and b are orthogonal vectors, then for every nonzero vector u, we have
proj_a(proj_b(u)) = 0


Answer: 


True 
(e) If a and u are nonzero vectors, then
proj_a(proj_a(u)) = proj_a(u)


Answer: 


True 
(f) If the relationship
proj_a u = proj_a v


holds for some nonzero vector a, then u = v.
Answer: 


False 


(g) For all vectors u and v, it is true that
||u + v|| = ||u|| + ||v||


Answer: 


False 
3.4 The Geometry of Linear Systems


In this section we will use parametric and vector methods to study general systems of linear equations. This work will enable us to interpret
solution sets of linear systems with n unknowns as geometric objects in R^n just as we interpreted solution sets of linear systems with two
and three unknowns as points, lines, and planes in R^2 and R^3.


Vector and Parametric Equations of Lines in R^2 and R^3


In the last section we derived equations of lines and planes that are determined by a point and a normal vector. However, there are other
useful ways of specifying lines and planes. For example, a unique line in R^2 or R^3 is determined by a point x0 on the line and a nonzero
vector v parallel to the line, and a unique plane in R^3 is determined by a point x0 in the plane and two noncollinear vectors v1 and v2
parallel to the plane. The best way to visualize this is to translate the vectors so their initial points are at x0 (Figure 3.4.1).


Figure 3.4.1 


Let us begin by deriving an equation for the line L that contains the point x0 and is parallel to v. If x is a general point on such a line, then,
as illustrated in Figure 3.4.2, the vector x - x0 will be some scalar multiple of v, say
x - x0 = tv or equivalently x = x0 + tv


As the variable t (called a parameter) varies from -∞ to ∞, the point x traces out the line L. Accordingly, we have the following result.


THEOREM 3.4.1 


Let L be the line in R^2 or R^3 that contains the point x0 and is parallel to the nonzero vector v. Then the equation of the line through
x0 that is parallel to v is
x = x0 + tv (1)
If x0 = 0, then the line passes through the origin and the equation has the form
x = tv (2)


Although it is not stated explicitly, it is understood in
Formulas 1 and 2 that the parameter t varies from -∞ to ∞.
This applies to all vector and parametric equations in this text
except where stated otherwise.


Figure 3.4.2 


Vector and Parametric Equations of Planes in R^3


Next we will derive an equation for the plane W that contains the point x0 and is parallel to the noncollinear vectors v1 and v2. As shown in
Figure 3.4.3, if x is any point in the plane, then by forming suitable scalar multiples of v1 and v2, say t1v1 and t2v2, we can create a
parallelogram with diagonal x - x0 and adjacent sides t1v1 and t2v2. Thus, we have

x - x0 = t1v1 + t2v2 or equivalently x = x0 + t1v1 + t2v2


Figure 3.4.3 


As the variables t1 and t2 (called parameters) vary independently from -∞ to ∞, the point x varies over the entire plane W. Accordingly,
we have the following result.


THEOREM 3.4.2


Let W be the plane in R^3 that contains the point x0 and is parallel to the noncollinear vectors v1 and v2. Then an equation of the
plane through x0 that is parallel to v1 and v2 is given by


x = x0 + t1v1 + t2v2 (3)


If x0 = 0, then the plane passes through the origin and the equation has the form


x = t1v1 + t2v2 (4)


Remark Observe that the line through x0 represented by Equation 1 is the translation by x0 of the line through the origin represented by
Equation 2 and that the plane through x0 represented by Equation 3 is the translation by x0 of the plane through the origin represented by
Equation 4 (Figure 3.4.4).


Figure 3.4.4 


Motivated by the forms of Formulas 1 to 4, we can extend the notions of line and plane to R^n by making the following definitions.


DEFINITION 1 


If x0 and v are vectors in R^n, and if v is nonzero, then the equation
x = x0 + tv (5)


defines the line through x0 that is parallel to v. In the special case where x0 = 0, the line is said to pass through the origin.


DEFINITION 2


If x0, v1, and v2 are vectors in R^n, and if v1 and v2 are not collinear, then the equation
x = x0 + t1v1 + t2v2 (6)


defines the plane through x0 that is parallel to v1 and v2. In the special case where x0 = 0, the plane is said to pass through the
origin.


Equations 5 and 6 are called vector forms of a line and plane in R^n. If the vectors in these equations are expressed in terms of their
components and the corresponding components on each side are equated, then the resulting equations are called parametric equations of
the line and plane. Here are some examples.


EXAMPLE 1 Vector and Parametric Equations of Lines in R^2 and R^3


(a) Find a vector equation and parametric equations of the line in R^2 that passes through the origin and is parallel to the
vector v = (-2, 3).

(b) Find a vector equation and parametric equations of the line in R^3 that passes through the point P0(1, 2, -3) and is
parallel to the vector v = (4, -5, 1).


(c) Use the vector equation obtained in part (b) to find two points on the line that are different from P0.


Solution
(a) It follows from 5 with x0 = 0 that a vector equation of the line is x = tv. If we let x = (x, y), then this equation can be
expressed in vector form as
(x, y) = t(-2, 3)


Equating corresponding components on the two sides of this equation yields the parametric equations


x = -2t, y = 3t


(b) It follows from 5 that a vector equation of the line is x = x0 + tv. If we let x = (x, y, z), and if we take
x0 = (1, 2, -3), then this equation can be expressed in vector form as


(x, y, z) = (1, 2, -3) + t(4, -5, 1) (7)


Equating corresponding components on the two sides of this equation yields the parametric equations
x = 1 + 4t, y = 2 - 5t, z = -3 + t


(c) A point on the line represented by Equation 7 can be obtained by substituting a specific numerical value for the
parameter t. However, since t = 0 produces (x, y, z) = (1, 2, -3), which is the point P0, this value of t does not serve
our purpose. Taking t = 1 produces the point (5, -3, -2) and taking t = -1 produces the point (-3, 7, -4). Any
other distinct values for t (except t = 0) would work just as well.
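

The substitution in part (c) can be sketched in a few lines of Python. This is only an illustration: the helper name point_on_line is an assumption introduced here, not something from the text.

```python
# Minimal sketch: evaluate the vector equation x = x0 + t*v componentwise.
# point_on_line is a hypothetical helper name chosen for this illustration.

def point_on_line(x0, v, t):
    """Return the point x0 + t*v as a tuple."""
    return tuple(p + t * d for p, d in zip(x0, v))


x0 = (1, 2, -3)   # the point P0 of Example 1(b)
v = (4, -5, 1)    # the direction vector of Example 1(b)

for t in (0, 1, -1):
    print(f"t = {t:2d}: {point_on_line(x0, v, t)}")
```

Because the helper works componentwise on tuples of any length, the same call also evaluates a line in R^n.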


EXAMPLE 2 Vector and Parametric Equations of a Plane in R^3


Find vector and parametric equations of the plane x - y + 2z = 5.


Solution We will find the parametric equations first. We can do this by solving the equation for any one of the variables in
terms of the other two and then using those two variables as parameters. For example, solving for x in terms of y and z yields


x = 5 + y - 2z (8)


and then using y and z as parameters t1 and t2, respectively, yields the parametric equations
x = 5 + t1 - 2t2, y = t1, z = t2


We would have obtained different parametric and
vector equations in Example 2 had we solved 8 for y or
z rather than x. However, one can show the same plane
results in all three cases as the parameters vary from
-∞ to ∞.


To obtain a vector equation of the plane we rewrite these parametric equations as
(x, y, z) = (5 + t1 - 2t2, t1, t2)
or, equivalently, as


(x, y, z) = (5, 0, 0) + t1(1, 1, 0) + t2(-2, 0, 1)
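

As a quick sanity check on Example 2, the following sketch samples the vector equation at several integer parameter values and confirms that each resulting point satisfies x - y + 2z = 5. The function names are assumptions made for this illustration.

```python
# Sketch: sample the vector form (x, y, z) = x0 + t1*v1 + t2*v2 from Example 2
# and check each sampled point against the plane equation x - y + 2z = 5.
# plane_point and satisfies_plane are hypothetical helper names.

def plane_point(x0, v1, v2, t1, t2):
    """Return x0 + t1*v1 + t2*v2 as a tuple."""
    return tuple(p + t1 * a + t2 * b for p, a, b in zip(x0, v1, v2))


def satisfies_plane(point):
    """Check the plane equation x - y + 2z = 5 of Example 2."""
    x, y, z = point
    return x - y + 2 * z == 5


x0, v1, v2 = (5, 0, 0), (1, 1, 0), (-2, 0, 1)
samples = [(t1, t2) for t1 in range(-3, 4) for t2 in range(-3, 4)]
all_on_plane = all(
    satisfies_plane(plane_point(x0, v1, v2, t1, t2)) for t1, t2 in samples
)
print(all_on_plane)
```

A finite sample is of course not a proof; the algebraic reason is that (5 + t1 - 2t2) - t1 + 2t2 = 5 for all t1 and t2.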


EXAMPLE 3 Vector and Parametric Equations of Lines and Planes in R^4


(a) Find vector and parametric equations of the line through the origin of R^4 that is parallel to the vector v = (5, -3, 6, 1).


(b) Find vector and parametric equations of the plane in R^4 that passes through the point x0 = (2, -1, 0, 3) and is parallel
to both v1 = (1, 5, 2, -4) and v2 = (0, 7, -8, 6).


Solution
(a) If we let x = (x1, x2, x3, x4), then the vector equation x = tv can be expressed as
(x1, x2, x3, x4) = t(5, -3, 6, 1)
Equating corresponding components yields the parametric equations
x1 = 5t, x2 = -3t, x3 = 6t, x4 = t


(b) The vector equation x = x0 + t1v1 + t2v2 can be expressed as
(x1, x2, x3, x4) = (2, -1, 0, 3) + t1(1, 5, 2, -4) + t2(0, 7, -8, 6)


which yields the parametric equations


x1 = 2 + t1
x2 = -1 + 5t1 + 7t2
x3 = 2t1 - 8t2
x4 = 3 - 4t1 + 6t2


Lines Through Two Points in R^n


If x0 and x1 are distinct points in R^n, then the line determined by these points is parallel to the vector v = x1 - x0 (Figure 3.4.5), so it
follows from 5 that the line can be expressed in vector form as


x = x0 + t(x1 - x0) (9)
or, equivalently, as
x = (1 - t)x0 + tx1 (10)
These are called the two-point vector equations of a line in R^n.
EXAMPLE 4 A Line Through Two Points in R^2
Find vector and parametric equations for the line in R^2 that passes through the points P(0, 7) and Q(5, 0).


Solution We will see below that it does not matter which point we take to be x0 and which we take to be x1, so let us
choose x0 = (0, 7) and x1 = (5, 0). It follows that x1 - x0 = (5, -7) and hence that


(x, y) = (0, 7) + t(5, -7) (11)


which we can rewrite in parametric form as
x = 5t, y = 7 - 7t


Had we reversed our choices and taken x0 = (5, 0) and x1 = (0, 7), then the resulting vector equation would have been
(x, y) = (5, 0) + t(-5, 7) (12)


and the parametric equations would have been
x = 5 - 5t, y = 7t
(verify). Although 11 and 12 look different, they both represent the line whose equation in rectangular coordinates is
7x + 5y = 35


(Figure 3.4.6). This can be seen by eliminating the parameter t from the parametric equations (verify).
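

The "verify" steps in Example 4 can be spot-checked numerically. The sketch below, with function names that are assumptions of this illustration, confirms that both parametrizations land on 7x + 5y = 35 and that replacing t by 1 - t turns one into the other.

```python
# Sketch: compare the two parametrizations of the line through P(0, 7) and Q(5, 0).
# first_form, second_form, and on_line are hypothetical helper names.

def first_form(t):
    """Equation (11): (x, y) = (0, 7) + t(5, -7)."""
    return (5 * t, 7 - 7 * t)


def second_form(t):
    """Equation (12): (x, y) = (5, 0) + t(-5, 7)."""
    return (5 - 5 * t, 7 * t)


def on_line(point):
    """Check the rectangular equation 7x + 5y = 35."""
    x, y = point
    return 7 * x + 5 * y == 35


for t in range(-2, 3):
    # Both forms give points on the same line ...
    assert on_line(first_form(t)) and on_line(second_form(t))
    # ... and they differ only by the reparametrization t -> 1 - t.
    assert first_form(t) == second_form(1 - t)
    print(t, first_form(t), second_form(t))
```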


Figure 3.4.5


Figure 3.4.6


The point x = (x, y) in Equations 9 and 10 traces an entire line in R^2 as the parameter t varies over the interval (-∞, ∞). If, however,
we restrict the parameter to vary from t = 0 to t = 1, then x will not trace the entire line but rather just the line segment joining the points
x0 and x1. The point x will start at x0 when t = 0 and end at x1 when t = 1. Accordingly, we make the following definition.


DEFINITION 3


If x0 and x1 are vectors in R^n, then the equation
x = x0 + t(x1 - x0) (0 ≤ t ≤ 1) (13)
defines the line segment from x0 to x1. When convenient, Equation 13 can be written as


x = (1 - t)x0 + tx1 (0 ≤ t ≤ 1) (14)


EXAMPLE 5 A Line Segment from One Point to Another in R^2


It follows from 13 and 14 that the line segment in R^2 from x0 = (1, -3) to x1 = (5, 6) can be represented either by the
equation


x = (1, -3) + t(4, 9) (0 ≤ t ≤ 1)


or by
x = (1 - t)(1, -3) + t(5, 6) (0 ≤ t ≤ 1)
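

To see that Equations 13 and 14 describe the same points, the following sketch evaluates both forms for the segment of Example 5 at a few parameter values. Exact rational arithmetic is used so that the comparison is not affected by rounding; the function names are assumptions of this illustration.

```python
from fractions import Fraction

# Sketch: the two equivalent forms of a line segment from x0 to x1.
# segment_point and segment_point_weighted are hypothetical helper names.

def segment_point(x0, x1, t):
    """Equation (13): x0 + t(x1 - x0), restricted to 0 <= t <= 1."""
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1] for a point on the segment")
    return tuple(a + t * (b - a) for a, b in zip(x0, x1))


def segment_point_weighted(x0, x1, t):
    """Equation (14): (1 - t)x0 + t*x1, restricted to 0 <= t <= 1."""
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1] for a point on the segment")
    return tuple((1 - t) * a + t * b for a, b in zip(x0, x1))


x0, x1 = (1, -3), (5, 6)
for t in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    p = segment_point(x0, x1, t)
    assert p == segment_point_weighted(x0, x1, t)
    print(f"t = {t}: ({', '.join(str(c) for c in p)})")
```

Rejecting values of t outside [0, 1] is a design choice made here to mirror the restriction in Definition 3; dropping the check gives back the full line of Equations 9 and 10.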


Dot Product Form of a Linear System 


Our next objective is to show how to express linear equations and linear systems in dot product notation. This will lead us to some 
important results about orthogonality and linear systems. 


Recall that a linear equation in the variables x1, x2, ..., xn has the form


a1 x1 + a2 x2 + ... + an xn = b (a1, a2, ..., an not all zero) (15)


and that the corresponding homogeneous equation is


a1 x1 + a2 x2 + ... + an xn = 0 (a1, a2, ..., an not all zero) (16)


These equations can be rewritten in vector form by letting
a = (a1, a2, ..., an) and x = (x1, x2, ..., xn)


in which case Formula 15 can be written as

a · x = b (17)
and Formula 16 as

a · x = 0 (18)
Except for a notational change from n to a, Formula 18 is the extension to R^n of Formula 6 in Section 3.3. This equation reveals that each
solution vector x of a homogeneous equation is orthogonal to the coefficient vector a. To take this geometric observation a step further,
consider the homogeneous system


a11 x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + a22 x2 + ... + a2n xn = 0
...
am1 x1 + am2 x2 + ... + amn xn = 0


If we denote the successive row vectors of the coefficient matrix by r1, r2, ..., rm, then we can rewrite this system in dot product form as


r1 · x = 0
r2 · x = 0
...
rm · x = 0     (19)


from which we see that every solution vector x is orthogonal to every row vector of the coefficient matrix. In summary, we have the
following result.


THEOREM 3.4.3


If A is an m × n matrix, then the solution set of the homogeneous linear system Ax = 0 consists of all vectors in R^n that are
orthogonal to every row vector of A.


EXAMPLE 6 Orthogonality of Row Vectors and Solution Vectors


We showed in Example 6 of Section 1.2 that the general solution of the homogeneous linear system


[ 1  3  -2   0  2   0 ] [ x1 ]   [ 0 ]
[ 2  6  -5  -2  4  -3 ] [ x2 ]   [ 0 ]
[ 0  0   5  10  0  15 ] [ x3 ] = [ 0 ]
[ 2  6   0   8  4  18 ] [ x4 ]   [ 0 ]
                        [ x5 ]
                        [ x6 ]

is
x1 = -3r - 4s - 2t, x2 = r, x3 = -2s, x4 = s, x5 = t, x6 = 0


which we can rewrite in vector form as
x = (-3r - 4s - 2t, r, -2s, s, t, 0)


According to Theorem 3.4.3, the vector x must be orthogonal to each of the row vectors
r1 = (1, 3, -2, 0, 2, 0)
r2 = (2, 6, -5, -2, 4, -3)
r3 = (0, 0, 5, 10, 0, 15)
r4 = (2, 6, 0, 8, 4, 18)


We will confirm that x is orthogonal to r1, and leave it for you to verify that x is orthogonal to the other three row vectors as
well. The dot product of r1 and x is


r1 · x = 1(-3r - 4s - 2t) + 3(r) + (-2)(-2s) + 0(s) + 2(t) + 0(0) = 0
which establishes the orthogonality.
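

The remaining three dot products that Example 6 leaves to the reader can be spot-checked with a short script. The sketch below picks a few parameter values (chosen arbitrarily for illustration) and computes r · x for every row; dot and solution are hypothetical helper names.

```python
# Sketch: check Theorem 3.4.3 on the system of Example 6 for sample parameters.

A = [
    [1, 3, -2, 0, 2, 0],
    [2, 6, -5, -2, 4, -3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, 8, 4, 18],
]


def dot(a, b):
    """Euclidean dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def solution(r, s, t):
    """General solution of Ax = 0 from Example 6, for given r, s, t."""
    return (-3 * r - 4 * s - 2 * t, r, -2 * s, s, t, 0)


for r, s, t in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -1, 5)]:
    x = solution(r, s, t)
    print((r, s, t), [dot(row, x) for row in A])
```

Checking the three choices (1, 0, 0), (0, 1, 0), and (0, 0, 1) is already enough in principle, since every solution is a linear combination of those three solution vectors and the dot product is linear.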


The Relationship Between Ax = 0 and Ax = b 


We will conclude this section by exploring the relationship between the solutions of a homogeneous linear system Ax = 0 and the solutions
(if any) of a nonhomogeneous linear system Ax = b that has the same coefficient matrix. These are called corresponding linear systems.


To motivate the result we are seeking, let us compare the solutions of the corresponding linear systems


[ 1  3  -2   0  2   0 ] [ x1 ]   [ 0 ]
[ 2  6  -5  -2  4  -3 ] [ x2 ]   [ 0 ]
[ 0  0   5  10  0  15 ] [ x3 ] = [ 0 ]
[ 2  6   0   8  4  18 ] [ x4 ]   [ 0 ]
                        [ x5 ]
                        [ x6 ]

and

[ 1  3  -2   0  2   0 ] [ x1 ]   [  0 ]
[ 2  6  -5  -2  4  -3 ] [ x2 ]   [ -1 ]
[ 0  0   5  10  0  15 ] [ x3 ] = [  5 ]
[ 2  6   0   8  4  18 ] [ x4 ]   [  6 ]
                        [ x5 ]
                        [ x6 ]


We showed in Example 5 and Example 6 of Section 1.2 that the general solutions of these linear systems can be written in parametric form
as


homogeneous → x1 = -3r - 4s - 2t, x2 = r, x3 = -2s, x4 = s, x5 = t, x6 = 0
nonhomogeneous → x1 = -3r - 4s - 2t, x2 = r, x3 = -2s, x4 = s, x5 = t, x6 = 1/3
which we can then rewrite in vector form as


homogeneous → (x1, x2, x3, x4, x5, x6) = (-3r - 4s - 2t, r, -2s, s, t, 0)
nonhomogeneous → (x1, x2, x3, x4, x5, x6) = (-3r - 4s - 2t, r, -2s, s, t, 1/3)


By splitting the vectors on the right apart and collecting terms with like parameters, we can rewrite these equations as


homogeneous → (x1, x2, x3, x4, x5, x6) = r(-3, 1, 0, 0, 0, 0) + s(-4, 0, -2, 1, 0, 0) + t(-2, 0, 0, 0, 1, 0) (20)
nonhomogeneous → (x1, x2, x3, x4, x5, x6) = r(-3, 1, 0, 0, 0, 0) + s(-4, 0, -2, 1, 0, 0) + t(-2, 0, 0, 0, 1, 0) + (0, 0, 0, 0, 0, 1/3) (21)


Formulas 20 and 21 reveal that each solution of the nonhomogeneous system can be obtained by adding the fixed vector (0, 0, 0, 0, 0, 1/3)
to the corresponding solution of the homogeneous system. This is a special case of the following general result.


THEOREM 3.4.4 


The general solution of a consistent linear system Ax = b can be obtained by adding any specific solution of Ax = b to the general
solution of Ax = 0.


Proof Let x0 be any specific solution of Ax = b, let W denote the solution set of Ax = 0, and let x0 + W denote the set of all vectors that
result by adding x0 to each vector in W. We must show that if x is a vector in x0 + W, then x is a solution of Ax = b, and conversely, that
every solution of Ax = b is in the set x0 + W.


Assume first that x is a vector in x0 + W. This implies that x is expressible in the form x = x0 + w, where Ax0 = b and Aw = 0. Thus,
Ax = A(x0 + w) = Ax0 + Aw = b + 0 = b


which shows that x is a solution of Ax = b.


Conversely, let x be any solution of Ax = b. To show that x is in the set x0 + W we must show that x is expressible in the form
x = x0 + w (22)


where w is in W (i.e., Aw = 0). We can do this by taking w = x - x0. This vector obviously satisfies 22, and it is in W since
Aw = A(x - x0) = Ax - Ax0 = b - b = 0


Figure 3.4.7 The solution set of Ax = b is a translation of the solution space of Ax = 0.


Remark Theorem 3.4.4 has a useful geometric interpretation that is illustrated in Figure 3.4.7. If, as discussed in Section 3.1, we interpret
vector addition as translation, then the theorem states that if x0 is any specific solution of Ax = b, then the entire solution set of Ax = b can
be obtained by translating the solution set of Ax = 0 by the vector x0.
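

Theorem 3.4.4 can be illustrated numerically on the pair of systems compared above. The sketch below assumes the right-hand side b = (0, -1, 5, 6) and the particular solution x0 = (0, 0, 0, 0, 0, 1/3); it checks that x0 solves Ax = b and that x0 + w does as well for sample homogeneous solutions w. The helper names are assumptions of this illustration.

```python
from fractions import Fraction

# Sketch: particular solution + homogeneous solution = solution of Ax = b.

A = [
    [1, 3, -2, 0, 2, 0],
    [2, 6, -5, -2, 4, -3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, 8, 4, 18],
]
b = [0, -1, 5, 6]                      # assumed right-hand side of Ax = b
x0 = (0, 0, 0, 0, 0, Fraction(1, 3))   # a particular solution of Ax = b


def mat_vec(matrix, x):
    """Matrix-vector product, computed row by row."""
    return [sum(a * c for a, c in zip(row, x)) for row in matrix]


def homogeneous_solution(r, s, t):
    """General solution of Ax = 0 (Formula 20)."""
    return (-3 * r - 4 * s - 2 * t, r, -2 * s, s, t, 0)


def add(u, v):
    """Componentwise vector sum."""
    return tuple(p + q for p, q in zip(u, v))


assert mat_vec(A, x0) == b
for r, s, t in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -1, 5)]:
    w = homogeneous_solution(r, s, t)
    assert mat_vec(A, w) == [0, 0, 0, 0]
    assert mat_vec(A, add(x0, w)) == b
print("x0 + w solves Ax = b for every sampled w")
```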


Concept Review 

* Parameters 

* Parametric equations of lines 

* Parametric equations of planes 

* Two-point vector equations of a line 

* Vector equation of a line 

* Vector equation of a plane 

Skills 

* Express the equations of lines in R^2 and R^3 using either vector or parametric equations.

* Express the equations of planes in R^n using either vector or parametric equations.

* Express the equation of a line containing two given points in R^2 or R^3 using either vector or parametric equations.
* Find equations of a line and a line segment.

* Verify the orthogonality of the row vectors of a linear system of equations and a solution vector.


* Use a specific solution to the nonhomogeneous linear system Ax = b and the general solution of the corresponding linear system
Ax = 0 to obtain the general solution to Ax = b.


Exercise Set 3.4 
In Exercises 1-4, find vector and parametric equations of the line containing the point and parallel to the vector.
1. Point: (-4, 1); vector: v = (0, -8)

Answer:

Vector equation: (x, y) = (-4, 1) + t(0, -8);


parametric equations: x = -4, y = 1 - 8t


2. Point: (2, -1); vector: v = (-4, -2)
3. Point: (0, 0, 0); vector: v = (-3, 0, 1)


Answer:

Vector equation: (x, y, z) = t(-3, 0, 1);

parametric equations: x = -3t, y = 0, z = t
4. Point: (-9, 3, 4); vector: v = (-1, 6, 0)
In Exercises 5-8, use the given equation of a line to find a point on the line and a vector parallel to the line. 
5. x = (3 - 5t, -6 - t)

Answer:


Point: (3, -6); parallel vector: (-5, -1)


6. (x, y, z) = (4t, 7, 4 + 3t)
7. x = (1 - t)(4, 6) + t(-2, 0)


Answer:


Point: (4, 6); parallel vector: (-6, -6)
8. x = (1 - t)(0, -5, 1)


In Exercises 9-12, find vector and parametric equations of the plane containing the given point and parallel vectors.
9. Point: (-3, 1, 0); vectors: v1 = (0, -3, 6) and v2 = (-5, 1, 2)
Answer:


Vector equation: (x, y, z) = (-3, 1, 0) + t1(0, -3, 6) + t2(-5, 1, 2);


parametric equations: x = -3 - 5t2, y = 1 - 3t1 + t2, z = 6t1 + 2t2
10. Point: (0, 6, -2); vectors: v1 = (0, 9, -1) and v2 = (0, -3, 0)
11. Point: (-1, 1, 4); vectors: v1 = (6, -1, 0) and v2 = (-1, 3, 1)


Answer:


Vector equation: (x, y, z) = (-1, 1, 4) + t1(6, -1, 0) + t2(-1, 3, 1);


parametric equations: x = -1 + 6t1 - t2, y = 1 - t1 + 3t2, z = 4 + t2
12. Point: (0, 5, -4); vectors: v1 = (0, 0, -5) and v2 = (1, -3, -2)


In Exercises 13-14, find vector and parametric equations of the line in R^2 that passes through the origin and is orthogonal to v.
13. v = (-2, 3)
Answer:


A possible answer is vector equation: (x, y) = t(3, 2);


parametric equations: x = 3t, y = 2t
14. v = (1, -4)
In Exercises 15-16, find vector and parametric equations of the plane in R^3 that passes through the origin and is orthogonal to v.


15. v = (4, 0, -5) [Hint: Construct two nonparallel vectors orthogonal to v in R^3].


Answer:


A possible answer is vector equation: (x, y, z) = t1(0, 1, 0) + t2(5, 0, 4);


parametric equations: x = 5t2, y = t1, z = 4t2
16. v = (3, 1, -6)


In Exercises 17-20, find the general solution to the linear system and confirm that the row vectors of the coefficient matrix are orthogonal
to the solution vectors.


17. x1 + x2 + x3 = 0
2x1 + 2x2 + 2x3 = 0
3x1 + 3x2 + 3x3 = 0


Answer:


x1 = -s - t, x2 = s, x3 = t
18. x1 + 3x2 - 4x3 = 0
2x1 + x2 - 8x3 = 0


19. x1 + 5x2 + x3 + 2x4 - x5 = 0
x1 - 2x2 - x3 + 3x4 + 2x5 = 0


Answer:
x1 = (3/7)r - (19/7)s - (8/7)t, x2 = -(2/7)r + (1/7)s + (3/7)t, x3 = r, x4 = s, x5 = t


20. x1 + 3x2 - 4x3 = 0
x1 + 2x2 + 3x3 = 0

21. (a) The equation x + y + z = 1 can be viewed as a linear system of one equation in three unknowns. Express a general solution of this
equation as a particular solution plus a general solution of the associated homogeneous system.


(b) Give a geometric interpretation of the result in part (a).


Answer:


(a) (1, 0, 0) + s(-1, 1, 0) + t(-1, 0, 1)
(b) a plane in R^3 passing through P(1, 0, 0) and parallel to (-1, 1, 0) and (-1, 0, 1)


22. (a) The equation x + y = 1 can be viewed as a linear system of one equation in two unknowns. Express a general solution of this 
equation as a particular solution plus a general solution of the associated homogeneous system. 


(b) Give a geometric interpretation of the result in part (a). 


23. (a) Find a homogeneous linear system of two equations in three unknowns whose solution space consists of those vectors in R^3 that are
orthogonal to a = (1, 1, 1) and b = (-2, 3, 0).
(b) What kind of geometric object is the solution space?
(c) Find a general solution of the system obtained in part (a), and confirm that Theorem 3.4.3 holds.


Answer:
(a) x + y + z = 0
-2x + 3y = 0


(b) a line through the origin in R^3


(c) x = -(3/5)t, y = -(2/5)t, z = t


24. (a) Find a homogeneous linear system of two equations in three unknowns whose solution space consists of those vectors in R^3 that are
orthogonal to a = (-3, 2, -1) and b = (0, -2, -2).
(b) What kind of geometric object is the solution space? 


(c) Find a general solution of the system obtained in part (a), and confirm that Theorem 3.4.3 holds. 


25. Consider the linear systems 


[  3   2  -1 ] [ x1 ]   [ 0 ]
[  6   4  -2 ] [ x2 ] = [ 0 ]
[ -3  -2   1 ] [ x3 ]   [ 0 ]
and
[  3   2  -1 ] [ x1 ]   [  2 ]
[  6   4  -2 ] [ x2 ] = [  4 ]
[ -3  -2   1 ] [ x3 ]   [ -2 ]


(a) Find a general solution of the homogeneous system.
(b) Confirm that x1 = 1, x2 = 0, x3 = 1 is a solution of the nonhomogeneous system.
(c) Use the results in parts (a) and (b) to find a general solution of the nonhomogeneous system.


(d) Check your result in part (c) by solving the nonhomogeneous system directly.


Answer:
(a) x1 = -(2/3)s + (1/3)t, x2 = s, x3 = t
(c) x1 = 1 - (2/3)s + (1/3)t, x2 = s, x3 = 1 + t
26. Consider the linear systems
[ 1  -2  3 ] [ x1 ]   [ 0 ]
[ 2   1  4 ] [ x2 ] = [ 0 ]
[ 1  -7  5 ] [ x3 ]   [ 0 ]
and
[ 1  -2  3 ] [ x1 ]   [  2 ]
[ 2   1  4 ] [ x2 ] = [  7 ]
[ 1  -7  5 ] [ x3 ]   [ -1 ]


(a) Find a general solution of the homogeneous system.
(b) Confirm that x1 = 1, x2 = 1, x3 = 1 is a solution of the nonhomogeneous system.
(c) Use the results in parts (a) and (b) to find a general solution of the nonhomogeneous system. 


(d) Check your result in part (c) by solving the nonhomogeneous system directly. 


In Exercises 27-28, find a general solution of the system, and use that solution to find a general solution of the associated homogeneous
system and a particular solution of the given system.


27.
[ 3   4  1   2 ] [ x1 ]   [  3 ]
[ 6   8  2   5 ] [ x2 ] = [  7 ]
[ 9  12  3  10 ] [ x3 ]   [ 13 ]
                 [ x4 ]
Answer:
x1 = 1/3 - (4/3)s - (1/3)t, x2 = s, x3 = t, x4 = 1; The general solution of the associated homogeneous system is
x1 = -(4/3)s - (1/3)t, x2 = s, x3 = t, x4 = 0. A particular solution of the given system is x1 = 1/3, x2 = 0, x3 = 0, x4 = 1.
28.
[ 9  -3  5   6 ] [ x1 ]   [  4 ]
[ 6  -2  3   1 ] [ x2 ] = [  5 ]
[ 3  -1  3  14 ] [ x3 ]   [ -8 ]
                 [ x4 ]

True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) The vector equation of a line can be determined from any point lying on the line and a nonzero vector parallel to the line. 
Answer: 


True 


(b) The vector equation of a plane can be determined from any point lying in the plane and a nonzero vector parallel to the plane. 
Answer: 


False 


(c) The points lying on a line through the origin in R^2 or R^3 are all scalar multiples of any nonzero vector on the line.


Answer: 


True 


(d) All solution vectors of the linear system Ax = b are orthogonal to the row vectors of the matrix A if and only if b = 0.
Answer: 


True 


(e) The general solution of the nonhomogeneous linear system Ax = b can be obtained by adding b to the general solution of the
homogeneous linear system Ax = 0.


Answer: 


False 


(f) If x1 and x2 are two solutions of the nonhomogeneous linear system Ax = b, then x1 - x2 is a solution of the corresponding
homogeneous linear system.


Answer: 


True 
3.5 Cross Product


This optional section is concerned with properties of vectors in 3-space that are important to physicists and 
engineers. It can be omitted, if desired, since subsequent sections do not depend on its content. Among other 
things, we define an operation that provides a way of constructing a vector in 3-space that is perpendicular to two 
given vectors, and we give a geometric interpretation of 3 x 3 determinants. 


Cross Product of Vectors 


In Section 3.2 we defined the dot product of two vectors u and v in n-space. That operation produced a scalar as its 
result. We will now define a type of vector multiplication that produces a vector as the result but which is 


applicable only to vectors in 3-space. 


DEFINITION 1


If u = (u1, u2, u3) and v = (v1, v2, v3) are vectors in 3-space, then the cross product u × v is the vector
defined by
u × v = (u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1)


or, in determinant notation,


u × v = ( | u2 u3 |     | u1 u3 |    | u1 u2 | )
        ( | v2 v3 | , - | v1 v3 | ,  | v1 v2 | )     (1)


Remark Instead of memorizing 1, you can obtain the components of u × v as follows:


* Form the 2 × 3 matrix

[ u1 u2 u3 ]
[ v1 v2 v3 ]

whose first row contains the components of u and whose second row
contains the components of v.


* To find the first component of u × v, delete the first column and take the determinant; to find the second
component, delete the second column and take the negative of the determinant; and to find the third component,
delete the third column and take the determinant.


EXAMPLE 1 Calculating a Cross Product
Find u × v, where u = (1, 2, -2) and v = (3, 0, 1).


Solution From either 1 or the mnemonic in the preceding remark, we have


u × v = ( | 2 -2 |     | 1 -2 |    | 1 2 | )
        ( | 0  1 | , - | 3  1 | ,  | 3 0 | )
      = (2, -7, -6)
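

The component formula in Definition 1 translates directly into code. The sketch below is an illustration only; cross is a helper name chosen here, and the function simply evaluates the three components for vectors given as 3-tuples.

```python
# Sketch: the cross product of Definition 1, written out component by component.

def cross(u, v):
    """Return u x v = (u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


# The vectors of Example 1.
print(cross((1, 2, -2), (3, 0, 1)))
```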


The following theorem gives some important relationships between the dot product and cross product and also
shows that u × v is orthogonal to both u and v.


Historical Note The cross product notation A × B was introduced by the American physicist and
mathematician J. Willard Gibbs (see p. 134) in a series of unpublished lecture notes for his students at Yale
University. It appeared in a published work for the first time in the second edition of the book Vector
Analysis by Edwin Wilson (1879-1964), a student of Gibbs. Gibbs originally referred to
A × B as the “skew product.”


THEOREM 3.5.1 Relationships Involving Cross Product and Dot Product 


If u, v, and w are vectors in 3-space, then
(a) u · (u × v) = 0 (u × v is orthogonal to u)


(b) v · (u × v) = 0 (u × v is orthogonal to v)


(c) ||u × v||^2 = ||u||^2 ||v||^2 - (u · v)^2 (Lagrange's identity)
(d) u × (v × w) = (u · w)v - (u · v)w (relationship between cross and dot products)


(e) (u × v) × w = (u · w)v - (v · w)u (relationship between cross and dot products)


Proof (a) Let u = (u1, u2, u3) and v = (v1, v2, v3). Then


u · (u × v) = (u1, u2, u3) · (u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1)


= u1(u2v3 - u3v2) + u2(u3v1 - u1v3) + u3(u1v2 - u2v1) = 0


Proof (b) Similar to (a).


Proof (c) Since


||u × v||^2 = (u2v3 - u3v2)^2 + (u3v1 - u1v3)^2 + (u1v2 - u2v1)^2 (2)
and
||u||^2 ||v||^2 - (u · v)^2 = (u1^2 + u2^2 + u3^2)(v1^2 + v2^2 + v3^2) - (u1v1 + u2v2 + u3v3)^2 (3)


the proof can be completed by “multiplying out” the right sides of 2 and 3 and verifying their equality.


Proof (d) and (e) See Exercises 38 and 39. 


EXAMPLE 2 u × v Is Perpendicular to u and to v


Consider the vectors
u = (1, 2, -2) and v = (3, 0, 1)
In Example 1 we showed that
u × v = (2, -7, -6)
Since
u · (u × v) = (1)(2) + (2)(-7) + (-2)(-6) = 0
and
v · (u × v) = (3)(2) + (0)(-7) + (1)(-6) = 0


u × v is orthogonal to both u and v, as guaranteed by Theorem 3.5.1.
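

Parts (a) through (c) of Theorem 3.5.1 can be spot-checked for the vectors of Example 2. The following sketch, with helper names that are assumptions of this illustration, computes both dot products and the two sides of Lagrange's identity using integer arithmetic.

```python
# Sketch: check orthogonality and Lagrange's identity for u = (1, 2, -2), v = (3, 0, 1).

def dot(a, b):
    """Euclidean dot product."""
    return sum(p * q for p, q in zip(a, b))


def cross(u, v):
    """Cross product in 3-space (Definition 1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


u, v = (1, 2, -2), (3, 0, 1)
n = cross(u, v)

# Parts (a) and (b): u x v is orthogonal to u and to v.
print(dot(u, n), dot(v, n))

# Part (c): ||u x v||^2 = ||u||^2 ||v||^2 - (u . v)^2.
lhs = dot(n, n)
rhs = dot(u, u) * dot(v, v) - dot(u, v) ** 2
print(lhs, rhs)
```

A single numerical example does not prove the identities, but it is a cheap way to catch a sign error in a hand computation.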


Joseph Louis Lagrange (1736-1813) 


Historical Note Joseph Louis Lagrange was a French-Italian mathematician and astronomer. Although 
his father wanted him to become a lawyer, Lagrange was attracted to mathematics and astronomy after 
reading a memoir by the astronomer Halley. At age 16 he began to study mathematics on his own and by 
age 19 was appointed to a professorship at the Royal Artillery School in Turin. The following year he 
solved some famous problems using new methods that eventually blossomed into a branch of mathematics 
called the calculus of variations. These methods and Lagrange's applications of them to problems in 
celestial mechanics were so monumental that by age 25 he was regarded by many of his contemporaries as 
the greatest living mathematician. One of Lagrange's most famous works is a memoir, Mécanique 
Analytique, in which he reduced the theory of mechanics to a few general formulas from which all other 
necessary equations could be derived. Napoleon was a great admirer of Lagrange and showered him with 
many honors. In spite of his fame, Lagrange was a shy and modest man. On his death, he was buried with 
honor in the Pantheon. 

[Image: ©SSPL/The Image Works]


The main arithmetic properties of the cross product are listed in the next theorem. 


THEOREM 3.5.2 Properties of Cross Product 


If u, v, and w are any vectors in 3-space and k is any scalar, then:
(a) u × v = -(v × u)

(b) u × (v + w) = (u × v) + (u × w)

(c) (u + v) × w = (u × w) + (v × w)

(d) k(u × v) = (ku) × v = u × (kv)

(e) u × 0 = 0 × u = 0

(f) u × u = 0


The proofs follow immediately from Formula 1 and properties of determinants; for example, part (a) can be proved 
as follows. 


Proof (a) Interchanging u and v in 1 interchanges the rows of the three determinants on the right side of 1 and
hence changes the sign of each component in the cross product. Thus u × v = -(v × u).


The proofs of the remaining parts are left as exercises. 
EXAMPLE 3 Standard Unit Vectors


Consider the vectors
i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1)


These vectors each have length 1 and lie along the coordinate axes (Figure 3.5.1). They are called the
standard unit vectors in 3-space. Every vector v = (v1, v2, v3) in 3-space is expressible in terms of
i, j, and k since we can write


v = (v1, v2, v3) = v1(1, 0, 0) + v2(0, 1, 0) + v3(0, 0, 1) = v1i + v2j + v3k
For example,
(2, -3, 4) = 2i - 3j + 4k


From 1 we obtain


i × j = ( | 0 0 |     | 1 0 |    | 1 0 | )
        ( | 1 0 | , - | 0 0 | ,  | 0 1 | ) = (0, 0, 1) = k

Figure 3.5.1 The standard unit vectors 


You should have no trouble obtaining the following results:


i × i = 0     j × j = 0     k × k = 0
i × j = k     j × k = i     k × i = j
j × i = -k    k × j = -i    i × k = -j


Figure 3.5.2 is helpful for remembering these results. Referring to this diagram, the cross product of two
consecutive vectors going clockwise is the next vector around, and the cross product of two consecutive vectors
going counterclockwise is the negative of the next vector around.


Figure 3.5.2 


Determinant Form of Cross Product 


It is also worth noting that a cross product can be represented symbolically in the form


        | i  j  k  |
u × v = | u1 u2 u3 | = | u2 u3 | i - | u1 u3 | j + | u1 u2 | k     (4)
        | v1 v2 v3 |   | v2 v3 |     | v1 v3 |     | v1 v2 |


For example, if u = (1, 2, -2) and v = (3, 0, 1), then


        | i  j  k |
u × v = | 1  2 -2 | = 2i - 7j - 6k
        | 3  0  1 |


which agrees with the result obtained in Example 1.


WARNING 


It is not true in general that u × (v × w) = (u × v) × w. For example,


i × (j × j) = i × 0 = 0
and
(i × j) × j = k × j = -i
so
i × (j × j) ≠ (i × j) × j
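

The counterexample in the warning can be reproduced with the standard unit vectors. The sketch below is an illustration under the same component formula as Definition 1; cross is a helper name chosen here.

```python
# Sketch: the cross product is not associative.

def cross(u, v):
    """Cross product in 3-space (Definition 1)."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2, u3 * v1 - u1 * v3, u1 * v2 - u2 * v1)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

left = cross(i, cross(j, j))    # i x (j x j) = i x 0
right = cross(cross(i, j), j)   # (i x j) x j = k x j

print(left, right, left == right)
```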


We know from Theorem 3.5.1 that u × v is orthogonal to both u and v. If u and v are nonzero vectors, it can be
shown that the direction of u × v can be determined using the following “right-hand rule” (Figure 3.5.3): Let θ be
the angle between u and v, and suppose u is rotated through the angle θ until it coincides with v. If the fingers of
the right hand are cupped so that they point in the direction of rotation, then the thumb indicates (roughly) the
direction of u × v.


Figure 3.5.3
You may find it instructive to practice this rule with the products
i × j = k, j × k = i, k × i = j
Geometric Interpretation of Cross Product 


If u and v are vectors in 3-space, then the norm of u × v has a useful geometric interpretation. Lagrange's identity,
given in Theorem 3.5.1, states that


||u × v||^2 = ||u||^2 ||v||^2 - (u · v)^2 (5)


If θ denotes the angle between u and v, then u · v = ||u|| ||v|| cos θ, so 5 can be rewritten as


||u × v||^2 = ||u||^2 ||v||^2 - ||u||^2 ||v||^2 cos^2 θ
= ||u||^2 ||v||^2 (1 - cos^2 θ)
= ||u||^2 ||v||^2 sin^2 θ


Since 0 ≤ θ ≤ π, it follows that sin θ ≥ 0, so this can be rewritten as
||u × v|| = ||u|| ||v|| sin θ (6)


But ||v|| sin θ is the altitude of the parallelogram determined by u and v (Figure 3.5.4). Thus, from 6, the area A of
this parallelogram is given by
A = (base)(altitude) = ||u|| ||v|| sin θ = ||u × v||


This result is even correct if u and v are collinear, since the parallelogram determined by u and v has zero area and
from 6 we have u × v = 0 because θ = 0 in this case. Thus we have the following theorem.


THEOREM 3.5.3 Area of a Parallelogram 


If u and v are vectors in 3-space, then ||u × v|| is equal to the area of the parallelogram determined by u
and v.


EXAMPLE 4 Area of a Triangle
Find the area of the triangle determined by the points P1(2, 2, 0), P2(−1, 0, 2), and P3(0, 4, 3).


Solution The area A of the triangle is 1/2 the area of the parallelogram determined by the vectors
P1P2 and P1P3 (Figure 3.5.5). Using the method discussed in Example 1 of Section 3.1,
P1P2 = (−3, −2, 2) and P1P3 = (−2, 2, 3). It follows that

\[
\overrightarrow{P_1P_2} \times \overrightarrow{P_1P_3} = (-10, 5, -10)
\]

(verify) and consequently that

\[
A = \tfrac{1}{2} \left\| \overrightarrow{P_1P_2} \times \overrightarrow{P_1P_3} \right\| = \tfrac{1}{2}(15) = \tfrac{15}{2}
\]
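
The computation in Example 4 can be reproduced with a short Python sketch. The function names `cross` and `triangle_area` are illustrative choices, not part of the text; the points are those of the example.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triangle_area(p1, p2, p3):
    """Half the area of the parallelogram spanned by P1P2 and P1P3."""
    a = tuple(q - p for p, q in zip(p1, p2))  # vector P1P2
    b = tuple(q - p for p, q in zip(p1, p3))  # vector P1P3
    n = cross(a, b)
    return 0.5 * math.sqrt(sum(c * c for c in n))


print(cross((-3, -2, 2), (-2, 2, 3)))                   # (-10, 5, -10)
print(triangle_area((2, 2, 0), (-1, 0, 2), (0, 4, 3)))  # 7.5
```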
DEFINITION 2
If u, v, and w are vectors in 3-space, then
u · (v × w)
is called the scalar triple product of u, v, and w.
Figure 3.5.4


Figure 3.5.5 (triangle with vertices P1(2, 2, 0), P2(−1, 0, 2), and P3(0, 4, 3))


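The scalar triple product of u = (u1, u2, u3), v = (v1, v2, v3), and w = (w1, w2, w3) can be calculated from the
formula

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \tag{7}
\]

This follows from Formula 4 since

\[
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) &= \mathbf{u} \cdot \left( \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}\mathbf{k} \right) \\
&= \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} u_1 - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} u_2 + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} u_3 \\
&= \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
\end{aligned}
\]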

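EXAMPLE 5 Calculating a Scalar Triple Product


Calculate the scalar triple product u · (v × w) of the vectors
u = 3i − 2j − 5k,  v = i + 4j − 4k,  w = 3j + 2k


Solution From 7,

\[
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) &= \begin{vmatrix} 3 & -2 & -5 \\ 1 & 4 & -4 \\ 0 & 3 & 2 \end{vmatrix} \\
&= 3\begin{vmatrix} 4 & -4 \\ 3 & 2 \end{vmatrix} - (-2)\begin{vmatrix} 1 & -4 \\ 0 & 2 \end{vmatrix} + (-5)\begin{vmatrix} 1 & 4 \\ 0 & 3 \end{vmatrix} \\
&= 60 + 4 - 15 = 49
\end{aligned}
\]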


Remark The symbol (u · v) × w makes no sense because we cannot form the cross product of a scalar and a
vector. Thus, no ambiguity arises if we write u · v × w rather than u · (v × w). However, for clarity we will usually
keep the parentheses.


It follows from 7 that
u · (v × w) = w · (u × v) = v · (w × u)


since the 3 × 3 determinants that represent these products can be obtained from one another by two row
interchanges. (Verify.) These relationships can be remembered by moving the vectors u, v, and w clockwise around
the vertices of the triangle in Figure 3.5.6.
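
As a numerical illustration, the following Python sketch evaluates the scalar triple product for the vectors of Example 5, read here as u = (3, −2, −5), v = (1, 4, −4), w = (0, 3, 2), and checks the cyclic relationships. The helper names `dot`, `cross`, and `triple` are assumptions of this sketch.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))


u, v, w = (3, -2, -5), (1, 4, -4), (0, 3, 2)

print(triple(u, v, w))                   # 49
print(triple(w, u, v), triple(v, w, u))  # 49 49  (cyclic shifts: two row interchanges)
print(triple(u, w, v))                   # -49    (a single interchange flips the sign)
```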




Figure 3.5.6 


Geometric Interpretation of Determinants 


The next theorem provides a useful geometric interpretation of 2 × 2 and 3 × 3 determinants.


THEOREM 3.5.4


(a) The absolute value of the determinant
\[ \det\begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix} \]
is equal to the area of the parallelogram in 2-space determined by the vectors u = (u1, u2) and
v = (v1, v2). (See Figure 3.5.7a.)
(b) The absolute value of the determinant
\[ \det\begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix} \]
is equal to the volume of the parallelepiped in 3-space determined by the vectors u = (u1, u2, u3),
v = (v1, v2, v3), and w = (w1, w2, w3). (See Figure 3.5.7b.)


Figure 3.5.7 (a) (b) (c)


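Proof (a) The key to the proof is to use Theorem 3.5.3. However, that theorem applies to vectors in 3-space,
whereas u = (u1, u2) and v = (v1, v2) are vectors in 2-space. To circumvent this "dimension problem," we will
view u and v as vectors in the xy-plane of an xyz-coordinate system (Figure 3.5.7c), in which case these vectors are
expressed as u = (u1, u2, 0) and v = (v1, v2, 0). Thus

\[
\mathbf{u} \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & 0 \\ v_1 & v_2 & 0 \end{vmatrix}
= \begin{vmatrix} u_1 & u_2 \\ v_1 & v_2 \end{vmatrix}\mathbf{k}
= \det\begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix}\mathbf{k}
\]

It now follows from Theorem 3.5.3 and the fact that ||k|| = 1 that the area A of the parallelogram determined by u
and v is

\[
A = \|\mathbf{u} \times \mathbf{v}\| = \left\| \det\begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix}\mathbf{k} \right\|
= \left| \det\begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix} \right| \|\mathbf{k}\|
= \left| \det\begin{bmatrix} u_1 & u_2 \\ v_1 & v_2 \end{bmatrix} \right|
\]

which completes the proof.


Proof (b) As shown in Figure 3.5.8, take the base of the parallelepiped determined by u, v, and w to be the
parallelogram determined by v and w. It follows from Theorem 3.5.3 that the area of the base is ||v × w|| and, as
illustrated in Figure 3.5.8, the height h of the parallelepiped is the length of the orthogonal projection of u on
v × w. Therefore, by Formula 12 of Section 3.3,

\[
h = \|\operatorname{proj}_{\mathbf{v} \times \mathbf{w}} \mathbf{u}\| = \frac{|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|}{\|\mathbf{v} \times \mathbf{w}\|}
\]

It follows that the volume V of the parallelepiped is

\[
V = (\text{area of base}) \cdot \text{height} = \|\mathbf{v} \times \mathbf{w}\| \, \frac{|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|}{\|\mathbf{v} \times \mathbf{w}\|} = |\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|
\]

so from 7,

\[
V = \left| \det\begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix} \right| \tag{8}
\]

which completes the proof.


Figure 3.5.8 (the height is h = ||proj_{v × w} u||)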


Remark If V denotes the volume of the parallelepiped determined by vectors u, v, and w, then it follows from
Formulas 7 and 8 that

\[
V = \begin{bmatrix} \text{volume of parallelepiped} \\ \text{determined by } \mathbf{u}, \mathbf{v}, \text{ and } \mathbf{w} \end{bmatrix} = |\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})| \tag{9}
\]

From this result and the discussion immediately following Definition 3 of Section 3.2, we can conclude that
u · (v × w) = ±V


where the + or − results depending on whether u makes an acute or an obtuse angle with v × w.


Formula 9 leads to a useful test for ascertaining whether three given vectors lie in the same plane. Since three
vectors not in the same plane determine a parallelepiped of positive volume, it follows from 9 that
|u · (v × w)| = 0 if and only if the vectors u, v, and w lie in the same plane. Thus we have the following result.


THEOREM 3.5.5 


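If the vectors u = (u1, u2, u3), v = (v1, v2, v3), and w = (w1, w2, w3) have the same initial point, then
they lie in the same plane if and only if

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = 0
\]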
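
The coplanarity test of Theorem 3.5.5 and the volume formula 9 can be sketched in Python as follows. The names `det3` and `coplanar` and the tolerance parameter `tol` are assumptions of this sketch; the tolerance only matters for floating-point input, and the sample vectors are chosen here for illustration.

```python
def det3(u, v, w):
    """3 x 3 determinant with rows u, v, w (cofactor expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def coplanar(u, v, w, tol=1e-12):
    """True when the scalar triple product u . (v x w) vanishes (up to tol)."""
    return abs(det3(u, v, w)) <= tol


print(coplanar((1, 0, 0), (0, 1, 0), (1, 1, 0)))   # True: all three lie in the xy-plane
print(coplanar((1, 0, 0), (0, 1, 0), (0, 0, 1)))   # False
print(abs(det3((1, 0, 0), (0, 1, 0), (0, 0, 1))))  # 1, the volume of the unit cube
```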


Concept Review 


Cross product of two vectors 
Determinant form of cross product 


Scalar triple product 


Skills 


Compute the cross product of two vectors u and v in R^3.


Know the geometric relationship between u × v and the vectors u and v.

Know the properties of the cross product (listed in Theorem 3.5.2).
Compute the scalar triple product of three vectors in 3-space.
Know the geometric interpretation of the scalar triple product.


Compute the areas of triangles and parallelograms determined by two vectors or three points in 2-space
or 3-space.


Use the scalar triple product to determine whether three given vectors in 3-space are coplanar.


Exercise Set 3.5 


In Exercises 1–2, let u = (3, 2, −1), v = (0, 2, −3), and w = (2, 6, 7). Compute the indicated vectors.


1. (a) v × w
(b) u × (v × w)
(c) (u × v) × w


Answer:
(a) (32, −6, −4)
(b) (−14, −20, −82)
(c) (27, 40, −42)


2. (a) (u × v) × (v × w)
(b) u × (v − 2w)
(c) (u × v) − 2w


In Exercises 3–6, use the cross product to find a vector that is orthogonal to both u and v.
3. u = (−6, 4, 2), v = (3, 1, 5)

Answer:
(18, 36, −18)

4. u = (1, 1, −2), v = (2, −1, 2)
5. u = (−2, 1, 5), v = (3, 0, −3)

Answer:
(−3, 9, −3)

6. u = (3, 3, 1), v = (0, 4, 2)


In Exercises 7–10, find the area of the parallelogram determined by the given vectors u and v.
7. u = (1, −1, 2), v = (0, 3, 1)

Answer:
√59

8. u = (3, −1, 4), v = (6, −2, 8)
9. u = (2, 3, 0), v = (−1, 2, −2)

Answer:
√101

10. u = (1, 1, 1), v = (3, 2, −5)


In Exercises 11–12, find the area of the parallelogram with the given vertices.
11. P1(1, 2), P2(4, 4), P3(7, 5), P4(4, 3)

Answer:
3

12. P1(3, 2), P2(5, 4), P3(9, 4), P4(7, 2)


In Exercises 13–14, find the area of the triangle with the given vertices.
13. A(2, 0), B(3, 4), C(−1, 2)

Answer:
7

14. A(1, 1), B(2, 2), C(3, −3)


In Exercises 15–16, find the area of the triangle in 3-space that has the given vertices.
15. P1(2, 6, −1), P2(1, 1, 1), P3(4, 6, 2)

Answer:
√374 / 2

16. P(1, −1, 2), Q(0, 3, 4), R(6, 1, 8)


In Exercises 17–18, find the volume of the parallelepiped with sides u, v, and w.
17. u = (2, −6, 2), v = (0, 4, −2), w = (2, 2, −4)

Answer:
16

18. u = (2, 1, 2), v = (4, 5, 1), w = (1, 2, 4)


In Exercises 19–20, determine whether u, v, and w lie in the same plane when positioned so that their initial
points coincide.


19. u = (−1, −2, 1), v = (3, 0, −2), w = (5, −4, 0)

Answer:
The vectors do not lie in the same plane.

20. u = (5, −2, 1), v = (4, −1, 1), w = (1, −1, 0)


In Exercises 21–24, compute the scalar triple product u · (v × w).
21. u = (−2, 0, 6), v = (1, −3, 1), w = (−5, −1, 1)

Answer:
−92

22. u = (−1, 2, 4), v = (3, 4, −2), w = (−1, 2, 5)


23. u = (a, 0, 0), v = (0, b, 0), w = (0, 0, c)

Answer:
abc


24. u = (3, −1, 0), v = (2, 4, 3), w = (5, −1, 2)
In Exercises 25–26, suppose that u · (v × w) = 3. Find
25. (a) u · (w × v)
(b) (v × w) · u
(c) w · (u × v)

Answer:
(a) −3
(b) 3
(c) 3


26. (a) v · (u × w)
(b) (u × w) · v
(c) v · (w × w)


27. (a) Find the area of the triangle having vertices A(1, 0, 1), B(0, 2, 3), and C(2, 1, 0).
(b) Use the result of part (a) to find the length of the altitude from vertex C to side AB.

Answer:
(a) √26 / 2
(b) √26 / 3


28. Use the cross product to find the sine of the angle between the vectors u = (2, 3, −6) and v = (2, 3, 6).

29. Simplify (u + v) × (u − v).

Answer:
2(v × u)

30. Let a = (a1, a2, a3), b = (b1, b2, b3), c = (c1, c2, c3), and d = (d1, d2, d3). Show that
(a + d) · (b × c) = a · (b × c) + d · (b × c)

31. Let u, v, and w be nonzero vectors in 3-space with the same initial point, but such that no two of them are
collinear. Show that
(a) u × (v × w) lies in the plane determined by v and w.
(b) (u × v) × w lies in the plane determined by u and v.

32. Prove the following identities.
(a) (u + kv) × v = u × v
(b) u · (v × z) = −(u × z) · v

33. Prove: If a, b, c, and d lie in the same plane, then (a × b) × (c × d) = 0.

34. Prove: If θ is the angle between u and v and u · v ≠ 0, then tan θ = ||u × v|| / (u · v).

35. Show that if u, v, and w are vectors in R^3, no two of which are collinear, then u × (v × w) lies in the plane
determined by v and w.

36. It is a theorem of solid geometry that the volume of a tetrahedron is (1/3)(area of base) · (height). Use this result
to prove that the volume of a tetrahedron whose sides are the vectors a, b, and c is (1/6)|a · (b × c)| (see the
accompanying figure).


Figure Ex-36


37. Use the result of Exercise 36 to find the volume of the tetrahedron with vertices P, Q, R, S.
(a) P(−1, 2, 0), Q(2, 1, −3), R(0, 1, 1), S(5, −2, 3)
(b) P(0, 0, 0), Q(1, 2, −1), R(3, 4, 0), S(−1, −3, 4)


38. Prove part (d) of Theorem 3.5.1. [Hint: First prove the result in the case where w = i = (1, 0, 0), then when
w = j = (0, 1, 0), and then when w = k = (0, 0, 1). Finally, prove it for an arbitrary vector w = (w1, w2, w3)
by writing w = w1 i + w2 j + w3 k.]

39. Prove part (e) of Theorem 3.5.1. [Hint: Apply part (a) of Theorem 3.5.2 to the result in part (d) of Theorem 
3.5.1.] 


40. Prove: 
(a) Prove (b) of Theorem 3.5.2. 
(b) Prove (c) of Theorem 3.5.2. 
(c) Prove (d) of Theorem 3.5.2. 
(d) Prove (e) of Theorem 3.5.2. 
(e) Prove (f) of Theorem 3.5.2. 


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 
(a) The cross product of two nonzero vectors u and v is a nonzero vector if and only if u and v are not parallel. 
Answer: 


True 
(b) A normal vector to a plane can be obtained by taking the cross product of two nonzero and noncollinear vectors 
lying in the plane. 


Answer: 


True 


(c) The scalar triple product of u, v, and w determines a vector whose length is equal to the volume of the 
parallelepiped determined by u, v, and w. 


Answer: 


False 


(d) If u and v are vectors in 3-space, then ||v x u|| is equal to the area of the parallelogram determined by u and v. 


Answer: 


True 


(e) For all vectors u, v, and w in 3-space, the vectors (u × v) × w and u × (v × w) are the same.
Answer: 


False 


(f) If u, v, and w are vectors in R^3, where u is nonzero and u × v = u × w, then v = w.
Answer: 


False 
Chapter 3 Supplementary Exercises


1. Let u = (−2, 0, 4), v = (3, −1, 6), and w = (2, −5, −5). Compute
(a) 3v − 2u
(b) ||u + v + w||
(c) the distance between −3u and v + 5w
(d) proj_w u
(e) u · (v × w)
(f) (−5v + w) × ((u · v)w)


Answer:
(a) 3v − 2u = (13, −3, 10)
(b) ||u + v + w|| = √70
(c) √774
(d) proj_w u = −(4/9)(2, −5, −5)
(e) u · (v × w) = −122
(f) (−5v + w) × ((u · v)w) = (−3150, −2430, 1170)


2. Repeat Exercise 1 for the vectors u = 3i − 5j + k, v = −2i + 2k, and w = −j + 4k.


3. Repeat parts (a)–(d) of Exercise 1 for the vectors u = (−2, 6, 2, 1), v = (−3, 0, 8, 0), and
w = (9, 1, −6, −6).


Answer:
(a) 3v − 2u = (−5, −12, 20, −2)
(b) ||u + v + w|| = √106
(c) √2810
(d) proj_w u = −(15/77)(9, 1, −6, −6)


4. Repeat parts (a)–(d) of Exercise 1 for the vectors u = (0, 5, 0, −1, −2), v = (1, −1, 6, −2, 0), and
w = (−4, −1, 4, 0, 2).


In Exercises 5—6, determine whether the given set of vectors forms an orthogonal set. If so, normalize each 
vector to form an orthonormal set. 


5. (−32, −1, 19), (3, −1, 5), (1, 6, 2)
Answer:


Not an orthogonal set


6. (−2, 0, 1), (1, 1, 2), (1, −5, 2)
7. (a) The set of all vectors in R^2 that are orthogonal to a nonzero vector is what kind of geometric object?
(b) The set of all vectors in R^3 that are orthogonal to a nonzero vector is what kind of geometric object?


(c) The set of all vectors in R^2 that are orthogonal to two noncollinear vectors is what kind of geometric
object?

(d) The set of all vectors in R^3 that are orthogonal to two noncollinear vectors is what kind of geometric
object?


Answer:


(a) A line through the origin, perpendicular to the given vector.
(b) A plane through the origin, perpendicular to the given vector.
(c) {0} (the origin)


(d) A line through the origin, perpendicular to the plane containing the two noncollinear vectors.


8. Show that v1 = (2/3, 1/3, 2/3) and v2 = (1/3, 2/3, −2/3) are orthonormal vectors, and find a third vector v3 for
which {v1, v2, v3} is an orthonormal set.
9. True or False: If u and v are nonzero vectors such that ||u + v||^2 = ||u||^2 + ||v||^2, then u and v are
orthogonal.
Answer: 


True 

10. True or False: If u is orthogonal to v + w, then u is orthogonal to v and w.

11. Consider the points P(3, −1, 4), Q(6, 0, 2), and R(5, 1, 1). Find the point S in R^3 whose first
component is −1 and such that PQ is parallel to RS.


Answer:
S(−1, −1, 5)

12. Consider the points P(−3, 1, 0, 6), Q(0, 5, 1, −2), and R(−4, 1, 4, 0). Find the point S in R^4 whose
third component is 6 and such that PQ is parallel to RS.


13. Using the points in Exercise 11, find the cosine of the angle between the vectors PQ and PR.
Answer:
√(14/17)

14. Using the points in Exercise 12, find the cosine of the angle between the vectors PQ and PR.


15. Find the distance between the point P(−3, 1, 3) and the plane 5x + z = 3y − 4.


Answer:
11/√35


16. Show that the planes 3x − y + 6z = 7 and −6x + 2y − 12z = 1 are parallel, and find the distance
between the planes.


In Exercises 17—22, find vector and parametric equations for the line or plane in question. 

17. The plane in R^3 that contains the points P(−2, 1, 3), Q(−1, −1, 1), and R(3, 0, −2).
Answer:
Vector equation: (x, y, z) = (−2, 1, 3) + t1(1, −2, −2) + t2(5, −1, −5);
parametric equations: x = −2 + t1 + 5t2, y = 1 − 2t1 − t2, z = 3 − 2t1 − 5t2

18. The line in R^3 that contains the point P(−1, 6, 0) and is orthogonal to the plane 4x − z = 5.


19. The line in R^2 that is parallel to the vector v = (8, −1) and contains the point P(0, −3).


Answer:
Vector equation: (x, y) = (0, −3) + t(8, −1);
parametric equations: x = 8t, y = −3 − t
20. The plane in R^3 that contains the point P(−2, 1, 0) and is parallel to the plane 8x + 6y − z = 4.
21. The line in R^2 with equation y = 3x − 5.

Answer:


A possible answer is vector equation: (x, y) = (0, −5) + t(1, 3); parametric equations:
x = t, y = −5 + 3t

22. The plane in R^3 with equation 2x − 6y + 3z = 5.


In Exercises 23—25, find a point-normal equation for the given plane. 


23. The plane that is represented by the vector equation
(x, y, z) = (−1, 5, 6) + t1(0, −1, 3) + t2(2, −1, 0).


Answer:
3(x + 1) + 6(y − 5) + 2(z − 6) = 0

24. The plane that contains the point P(−5, 1, 0) and is orthogonal to the line with parametric equations
x = 3 − 5t, y = 2t, and z = 7.

25. The plane that passes through the points P(9, 0, 4), Q(−1, 4, 3), and R(0, 6, −2).


Answer:


−18(x − 9) − 51y − 24(z − 4) = 0


26. Suppose that {v1, v2, v3} and {w1, w2} are two sets of vectors such that vi and wj are orthogonal for
all i and j. Prove that if a1, a2, a3, b1, b2 are any scalars, then the vectors v = a1v1 + a2v2 + a3v3 and
w = b1w1 + b2w2 are orthogonal.

27. Prove that if two vectors u and v in R^2 are orthogonal to a nonzero vector w in R^2, then u and v are scalar
multiples of each other.

28. Prove that ||u + v|| = ||u|| + ||v|| if and only if u and v are parallel vectors.

29. The equation Ax + By = 0 represents a line through the origin in R^2 if A and B are not both zero. What
does this equation represent in R^3 if you think of it as Ax + By + 0z = 0? Explain.


Answer: 


A plane 
INTRODUCTION


Recall that we began our study of vectors by viewing them as directed line segments 
(arrows). We then extended this idea by introducing rectangular coordinate systems, which 
enabled us to view vectors as ordered pairs and ordered triples of real numbers. As we 
developed properties of these vectors we noticed patterns in various formulas that enabled 
us to extend the notion of a vector to an n-tuple of real numbers. Although n-tuples took
us outside the realm of our "visual experience," they gave us a valuable tool for
understanding and studying systems of linear equations. In this chapter we will extend the
concept of a vector yet again by using the most important algebraic properties of vectors
in R^n as axioms. These axioms, if satisfied by a set of objects, will enable us to think of
those objects as vectors. 
4.1 Real Vector Spaces


In this section we will extend the concept of a vector by using the basic properties of vectors in R^n as axioms, which, if satisfied
by a set of objects, guarantee that those objects behave like familiar vectors.


Vector Space Axioms 


The following definition consists of ten axioms, eight of which are properties of vectors in R^n that were stated in Theorem 3.1.1.
It is important to keep in mind that one does not prove axioms; rather, they are assumptions that serve as the starting point for 
proving theorems. 


Vector space scalars can be real numbers or complex 
numbers. Vector spaces with real scalars are called real 
vector spaces and those with complex scalars are called 
complex vector spaces. For now we will be concerned 
exclusively with real vector spaces. We will consider 
complex vector spaces later. 


DEFINITION 1 


Let V be an arbitrary nonempty set of objects on which two operations are defined: addition, and multiplication by
scalars. By addition we mean a rule for associating with each pair of objects u and v in V an object u + v, called the
sum of u and v; by scalar multiplication we mean a rule for associating with each scalar k and each object u in V an
object ku, called the scalar multiple of u by k. If the following axioms are satisfied by all objects u, v, w in V and all
scalars k and m, then we call V a vector space and we call the objects in V vectors.


1. If u and v are objects in V, then u + v is in V.

2. u + v = v + u

3. u + (v + w) = (u + v) + w

4. There is an object 0 in V, called a zero vector for V, such that 0 + u = u + 0 = u for all u in V.

5. For each u in V, there is an object −u in V, called a negative of u, such that u + (−u) = (−u) + u = 0.

6. If k is any scalar and u is any object in V, then ku is in V.

7. k(u + v) = ku + kv

8. (k + m)u = ku + mu

9. k(mu) = (km)(u)

10. 1u = u


Observe that the definition of a vector space does not specify the nature of the vectors or the operations. Any kind of object can 
be a vector, and the operations of addition and scalar multiplication need not have any relationship to those on R^n. The only
requirement is that the ten vector space axioms be satisfied. In the examples that follow we will use four basic steps to show 
that a set with two operations is a vector space. 


To Show that a Set with Two Operations is a Vector Space 
Step 1 Identify the set V of objects that will become vectors. 


Step 2 Identify the addition and scalar multiplication operations on V. 

Step 3 Verify Axioms 1 and 6; that is, adding two vectors in V produces a vector in V, and multiplying a vector in V by
a scalar also produces a vector in V. Axiom 1 is called closure under addition, and Axiom 6 is called closure under 
scalar multiplication. 


Step 4 Confirm that Axioms 2, 3, 4, 5, 7, 8, 9, and 10 hold. 
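
Step 4 can be spot-checked numerically before attempting a proof. The Python sketch below is only an illustration under stated assumptions: the function `failed_axioms` and its parameters are hypothetical names introduced here, vectors are represented as tuples of floats, and passing the check on a finite sample does not prove that the axioms hold. The closure axioms 1 and 6 are not tested because they need a membership test for V.

```python
import itertools
import math


def close(a, b):
    """Compare two tuples of floats componentwise."""
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


def failed_axioms(add, smul, zero, neg, vectors, scalars):
    """Return the axioms among 2-5 and 7-10 that fail on the given samples."""
    failed = set()
    for u, v, w in itertools.product(vectors, repeat=3):
        if not close(add(u, v), add(v, u)):
            failed.add(2)
        if not close(add(u, add(v, w)), add(add(u, v), w)):
            failed.add(3)
    for u in vectors:
        if not (close(add(zero, u), u) and close(add(u, zero), u)):
            failed.add(4)
        if not (close(add(u, neg(u)), zero) and close(add(neg(u), u), zero)):
            failed.add(5)
        if not close(smul(1, u), u):
            failed.add(10)
        for k, m in itertools.product(scalars, repeat=2):
            if not close(smul(k + m, u), add(smul(k, u), smul(m, u))):
                failed.add(8)
            if not close(smul(k, smul(m, u)), smul(k * m, u)):
                failed.add(9)
    for u, v in itertools.product(vectors, repeat=2):
        for k in scalars:
            if not close(smul(k, add(u, v)), add(smul(k, u), smul(k, v))):
                failed.add(7)
    return sorted(failed)


# The standard operations on R^2.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])


def smul(k, u):
    return (k * u[0], k * u[1])


def neg(u):
    return (-u[0], -u[1])


samples = [(1.0, 2.0), (-3.0, 0.5), (0.0, 4.0)]
scalars = [-2.0, 0.0, 1.5]
print(failed_axioms(add, smul, (0.0, 0.0), neg, samples, scalars))  # []
```

An empty list means no counterexample was found among the samples; a nonempty list names axioms that definitely fail.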


Hermann Günther Grassmann (1809-1877) 


Historical Note The notion of an "abstract vector space" evolved over many years and had many contributors. The
idea crystallized with the work of the German mathematician H. G. Grassmann, who published a paper in 1862 in which 
he considered abstract systems of unspecified elements on which he defined formal operations of addition and scalar 
multiplication. Grassmann's work was controversial, and others, including Augustin Cauchy (p. 137), laid reasonable 
claim to the idea. 

[Image: (c)Sueddeutsche Zeitung Photo/The Image Works] 


Our first example is the simplest of all vector spaces in that it contains only one object. Since Axiom 4 requires that every 
vector space contain a zero vector, the object will have to be that vector. 


EXAMPLE 1 The Zero Vector Space


Let V consist of a single object, which we denote by 0, and define

0 + 0 = 0 and k0 = 0
for all scalars k. It is easy to check that all the vector space axioms are satisfied. We call this the zero vector
space.


Our second example is one of the most important of all vector spaces: the familiar space R^n. It should not be surprising that
the operations on R^n satisfy the vector space axioms because those axioms were based on known properties of operations on R^n.


EXAMPLE 2 R^n Is a Vector Space


Let V = R^n, and define the vector space operations on V to be the usual operations of addition and scalar
multiplication of n-tuples; that is,
u + v = (u1, u2, ..., un) + (v1, v2, ..., vn) = (u1 + v1, u2 + v2, ..., un + vn)
ku = (ku1, ku2, ..., kun)


The set V = R^n is closed under addition and scalar multiplication because the foregoing operations produce
n-tuples as their end result, and these operations satisfy Axioms 2, 3, 4, 5, 7, 8, 9, and 10 by virtue of Theorem
3.1.1.

Our next example is a generalization of R^n in which we allow vectors to have infinitely many components.
EXAMPLE 3 The Vector Space of Infinite Sequences of Real Numbers


Let V consist of objects of the form
u = (u1, u2, ..., un, ...)
in which u1, u2, ..., un, ... is an infinite sequence of real numbers. We define two infinite sequences to be equal if
their corresponding components are equal, and we define addition and scalar multiplication componentwise by
u + v = (u1, u2, ..., un, ...) + (v1, v2, ..., vn, ...)
= (u1 + v1, u2 + v2, ..., un + vn, ...)
ku = (ku1, ku2, ..., kun, ...)
We leave it as an exercise to confirm that V with these operations is a vector space. We will denote this vector
space by the symbol R^∞.


In the next example our vectors will be matrices. This may be a little confusing at first because matrices are composed of rows 
and columns, which are themselves vectors (row vectors and column vectors). However, here we will not be concerned with the 
individual rows and columns but rather with the properties of the matrix operations as they relate to the matrix as a whole. 


Note that Equation 1 involves three different addition 
operations: the addition operation on vectors, the 
addition operation on matrices, and the addition 
operation on real numbers. 


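EXAMPLE 4 A Vector Space of 2 × 2 Matrices


Let V be the set of 2 × 2 matrices with real entries, and take the vector space operations on V to be the usual
operations of matrix addition and scalar multiplication; that is,

\[
\begin{aligned}
\mathbf{u} + \mathbf{v} &= \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} = \begin{bmatrix} u_{11} + v_{11} & u_{12} + v_{12} \\ u_{21} + v_{21} & u_{22} + v_{22} \end{bmatrix} \\
k\mathbf{u} &= k\begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} ku_{11} & ku_{12} \\ ku_{21} & ku_{22} \end{bmatrix}
\end{aligned} \tag{1}
\]

The set V is closed under addition and scalar multiplication because the foregoing operations produce 2 × 2
matrices as the end result. Thus, it remains to confirm that Axioms 2, 3, 4, 5, 7, 8, 9, and 10 hold. Some of these
are standard properties of matrix operations. For example, Axiom 2 follows from Theorem 1.4.1a since

\[
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} + \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{v} + \mathbf{u}
\]

Similarly, Axioms 3, 7, 8, and 9 follow from parts (b), (h), (j), and (e), respectively, of that theorem (verify). This
leaves Axioms 4, 5, and 10 that remain to be verified.


To confirm that Axiom 4 is satisfied, we must find a 2 × 2 matrix 0 in V for which u + 0 = 0 + u = u for all 2 × 2
matrices u in V. We can do this by taking

\[
\mathbf{0} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

With this definition,

\[
\mathbf{0} + \mathbf{u} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{u}
\]

and similarly u + 0 = u. To verify that Axiom 5 holds we must show that each object u in V has a negative −u in
V such that u + (−u) = 0 and (−u) + u = 0. This can be done by defining the negative of u to be

\[
-\mathbf{u} = \begin{bmatrix} -u_{11} & -u_{12} \\ -u_{21} & -u_{22} \end{bmatrix}
\]

With this definition,

\[
\mathbf{u} + (-\mathbf{u}) = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} + \begin{bmatrix} -u_{11} & -u_{12} \\ -u_{21} & -u_{22} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \mathbf{0}
\]

and similarly (−u) + u = 0. Finally, Axiom 10 holds because

\[
1\mathbf{u} = 1\begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix} = \mathbf{u}
\]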


EXAMPLE 5 The Vector Space of m × n Matrices


Example 4 is a special case of a more general class of vector spaces. You should have no trouble adapting the
argument used in that example to show that the set V of all m × n matrices with the usual matrix operations of
addition and scalar multiplication is a vector space. We will denote this vector space by the symbol M_mn. Thus,
for example, the vector space in Example 4 is denoted as M_22.


In Example 6 the functions were defined on the entire
interval (−∞, ∞). However, the arguments used in
that example apply as well on all subintervals of
(−∞, ∞), such as a closed interval [a, b] or an open
interval (a, b). We will denote the vector spaces of
functions on these intervals by F[a, b] and F(a, b),
respectively.


EXAMPLE 6 The Vector Space of Real-Valued Functions


Let V be the set of real-valued functions that are defined at each x in the interval (−∞, ∞). If f = f(x) and
g = g(x) are two functions in V and if k is any scalar, then define the operations of addition and scalar
multiplication by


(f + g)(x) = f(x) + g(x)     (2)


(kf)(x) = kf(x)     (3)


One way to think about these operations is to view the numbers f(x) and g(x) as "components" of f and g at the
point x, in which case Equations 2 and 3 state that two functions are added by adding corresponding components,
and a function is multiplied by a scalar by multiplying each component by that scalar, exactly as in R^n and R^∞.
This idea is illustrated in parts (a) and (b) of Figure 4.1.1. The set V with these operations is denoted by the
symbol F(−∞, ∞). We can prove that this is a vector space as follows:


Axioms 1 and 6 These closure axioms require that sums and scalar multiples of functions that are defined at each x
in the interval (−∞, ∞) are also defined at each x in that interval. This follows from Formulas 2 and 3.


Axiom 4 This axiom requires that there exists a function 0 in F(−∞, ∞), which when added to any other
function f in F(−∞, ∞) produces f back again as the result. The function whose value at every point x in the
interval (−∞, ∞) is zero has this property. Geometrically, the graph of the function 0 is the line that
coincides with the x-axis.

Axiom 5 This axiom requires that for each function f in F(−∞, ∞) there exists a function −f in
F(−∞, ∞), which when added to f produces the function 0. The function defined by (−f)(x) = −f(x) has
this property. The graph of −f can be obtained by reflecting the graph of f about the x-axis (Figure 4.1.1c).

Axioms 2, 3, 7, 8, 9, 10 The validity of each of these axioms follows from properties of real numbers. For example,
if f and g are functions in F(−∞, ∞), then Axiom 2 requires that f + g = g + f. This follows from the
computation


(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x)
in which the first and last equalities follow from 2, and the middle equality is a property of real numbers. We will
leave the proofs of the remaining parts as exercises.
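
The pointwise operations 2 and 3 translate directly into code when functions are treated as values. The following Python sketch is an illustration only: the helper names `f_add`, `f_smul`, `f_neg`, and `zero` are assumptions of this sketch, and the checks look at a handful of sample points rather than every x in (−∞, ∞).

```python
import math


def f_add(f, g):
    """The function f + g defined by (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def f_smul(k, f):
    """The function kf defined by (kf)(x) = k f(x)."""
    return lambda x: k * f(x)


def f_neg(f):
    """The negative of f, defined by (-f)(x) = -f(x)."""
    return lambda x: -f(x)


def zero(x):
    """The zero vector of the function space: the function that is 0 everywhere."""
    return 0.0


f, g = math.sin, math.exp
for x in [-2.0, 0.0, 0.5, 3.0]:
    assert f_add(f, g)(x) == f_add(g, f)(x)  # Axiom 2 at the sample point x
    assert f_add(f, zero)(x) == f(x)         # Axiom 4 at the sample point x
    assert f_add(f, f_neg(f))(x) == 0.0      # Axiom 5 at the sample point x

h = f_add(f, f_smul(2, math.cos))            # h(x) = sin x + 2 cos x
print(h(0.0))  # 2.0
```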


(a) (b) (c) 


Figure 4.1.1 


It is important to recognize that you cannot impose any two operations on any set V and expect the vector space axioms to hold. 
For example, if V is the set of n-tuples with positive components, and if the standard operations from R^n are used, then V is not
closed under scalar multiplication, because if u is a nonzero n-tuple in V, then ( —1)u has at least one negative component and 
hence is not in V. The following is a less obvious example in which only one of the ten vector space axioms fails to hold. 


EXAMPLE 7 A Set That Is Not a Vector Space


Let V = R^2 and define addition and scalar multiplication operations as follows: If u = (u1, u2) and v = (v1, v2),
then define
u + v = (u1 + v1, u2 + v2)
and if k is any real number, then define
ku = (ku1, 0)

For example, if u = (2, 4), v = (−3, 5), and k = 7, then

u + v = (2 + (−3), 4 + 5) = (−1, 9)

ku = 7u = (7 · 2, 0) = (14, 0)
The addition operation is the standard one from R^2, but the scalar multiplication is not. In the exercises we will
ask you to show that the first nine vector space axioms are satisfied. However, Axiom 10 fails to hold for certain
vectors. For example, if u = (u1, u2) is such that u2 ≠ 0, then


1u = 1(u1, u2) = (1 · u1, 0) = (u1, 0) ≠ u


Thus, V is not a vector space with the stated operations.
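
The failure of Axiom 10 in Example 7 is easy to observe directly. This short Python sketch, with illustrative function names `add` and `smul`, reproduces the sample computations from the example and then evaluates 1u.

```python
def add(u, v):
    """Standard addition on R^2."""
    return (u[0] + v[0], u[1] + v[1])


def smul(k, u):
    """The nonstandard scalar multiplication of Example 7: ku = (k*u1, 0)."""
    return (k * u[0], 0)


u, v = (2, 4), (-3, 5)

print(add(u, v))        # (-1, 9)
print(smul(7, u))       # (14, 0)
print(smul(1, u))       # (2, 0)
print(smul(1, u) == u)  # False, so Axiom 10 fails for this u
```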


Our final example will be an unusual vector space that we have included to illustrate how varied vector spaces can be. Since the 
objects in this space will be real numbers, it will be important for you to keep track of which operations are intended as vector 
operations and which ones as ordinary operations on real numbers. 


EXAMPLE 8 An Unusual Vector Space


Let V be the set of positive real numbers, and define the operations on V to be
u + v = uv     [Vector addition is numerical multiplication.]
ku = u^k     [Scalar multiplication is numerical exponentiation.]
Thus, for example, 1 + 1 = 1 and (2)(1) = 1^2 = 1. This is strange indeed, but nevertheless the set V with these
operations satisfies the 10 vector space axioms and hence is a vector space. We will confirm Axioms 4, 5, and 7,
and leave the others as exercises.


* Axiom 4: The zero vector in this space is the number 1 (i.e., 0 = 1) since
u + 1 = u · 1 = u


* Axiom 5: The negative of a vector u is its reciprocal (i.e., −u = 1/u) since
u + (1/u) = u(1/u) = 1 (= 0)


* Axiom 7: k(u + v) = (uv)^k = u^k v^k = (ku) + (kv)

Some Properties of Vectors 


The following is our first theorem about general vector spaces. As you will see, its proof is very formal with each step being 
justified by a vector space axiom or a known property of real numbers. There will not be many rigidly formal proofs of this type 
in the text, but we have included these to reinforce the idea that the familiar properties of vectors can all be derived from the 
vector space axioms. 


THEOREM 4.1.1 


Let V be a vector space, u a vector in V, and k a scalar; then:


(a) 0u = 0
(b) k0 = 0
(c) (−1)u = −u


(d) If ku = 0, then k = 0 or u = 0.


We will prove parts (a) and (c) and leave proofs of the remaining parts as exercises. 


Proof (a) We can write


0u + 0u = (0 + 0)u     [Axiom 8]
        = 0u           [Property of the number 0]


By Axiom 5 the vector 0u has a negative, −0u. Adding this negative to both sides above yields
[0u + 0u] + (−0u) = 0u + (−0u)


or
0u + [0u + (−0u)] = 0u + (−0u)     [Axiom 3]
0u + 0 = 0                         [Axiom 5]
0u = 0                             [Axiom 4]

Proof (c) To prove that (−1)u = −u, we must show that u + (−1)u = 0. The proof is as follows:


u + (−1)u = 1u + (−1)u     [Axiom 10]
          = (1 + (−1))u    [Axiom 8]
          = 0u             [Property of numbers]
          = 0              [Part (a) of this theorem]


A Closing Observation 


This section of the text is very important to the overall plan of linear algebra in that it establishes a common thread between
such diverse mathematical objects as geometric vectors, vectors in R^n, infinite sequences, matrices, and real-valued functions,
to name a few. As a result, whenever we discover a new theorem about general vector spaces, we will at the same time be
discovering a theorem about geometric vectors, vectors in R^n, sequences, matrices, real-valued functions, and about any new
kinds of vectors that we might discover.


To illustrate this idea, consider what the rather innocent-looking result in part (a) of Theorem 4.1.1 says about the vector space 
in Example 8. Keeping in mind that the vectors in that space are positive real numbers, that scalar multiplication means 
numerical exponentiation, and that the zero vector is the number 1, the equation 


0u = 0
is a statement of the fact that if u is a positive real number, then
u^0 = 1


Concept Review 

* Vector space 

* Closure under addition 

* Closure under scalar multiplication 


* Examples of vector spaces 


Skills 
* Determine whether a given set with two operations is a vector space. 


* Show that a set with two operations is not a vector space by demonstrating that at least one of the vector space axioms 
fails. 


Exercise Set 4.1 


1. Let V be the set of all ordered pairs of real numbers, and consider the following addition and scalar multiplication operations
on u = (u1, u2) and v = (v1, v2):
u + v = (u1 + v1, u2 + v2), ku = (0, ku2)
(a) Compute u + v and ku for u = (-1, 2), v = (3, 4), and k = 3.
(b) In words, explain why V is closed under addition and scalar multiplication.
(c) Since addition on V is the standard addition operation on R^2, certain vector space axioms hold for V because they are
known to hold for R^2. Which axioms are they?


(d) Show that Axioms 7, 8, and 9 hold. 


(e) Show that Axiom 10 fails and hence that V is not a vector space under the given operations. 


Answer:

(a) u + v = (2, 6), 3u = (0, 6)
(c) Axioms 1-5


2. Let V be the set of all ordered pairs of real numbers, and consider the following addition and scalar multiplication operations
on u = (u1, u2) and v = (v1, v2):
u + v = (u1 + v1 + 1, u2 + v2 + 1), ku = (ku1, ku2)
(a) Compute u + v and ku for u = (0, 4), v = (1, -3), and k = 2.
(b) Show that (0, 0) ≠ 0.
(c) Show that (-1, -1) = 0.
(d) Show that Axiom 5 holds by producing an ordered pair -u such that u + (-u) = 0 for u = (u1, u2).


(e) Find two vector space axioms that fail to hold. 


In Exercises 3-12, determine whether each set equipped with the given operations is a vector space. For those that are not
vector spaces, identify the vector space axioms that fail.

3. The set of all real numbers with the standard operations of addition and multiplication.


Answer: 


The set is a vector space with the given operations. 


4. The set of all pairs of real numbers of the form (x, 0) with the standard operations on R^2.


5. The set of all pairs of real numbers of the form (x, y), where x ≥ 0, with the standard operations on R^2.


Answer: 


Not a vector space, Axioms 5 and 6 fail. 


6. The set of all n-tuples of real numbers that have the form (x, x, ..., x) with the standard operations on R^n.


7. The set of all triples of real numbers with the standard vector addition but with scalar multiplication defined by

k(x, y, z) = (kx, ky, k^2 z)


Answer: 


Not a vector space. Axiom 8 fails. 


8. The set of all 2 x 2 invertible matrices with the standard matrix addition and scalar multiplication.


9. The set of all 2 x 2 matrices of the form

[matrix illegible in the source]

with the standard matrix addition and scalar multiplication.
Answer: 


The set is a vector space with the given operations. 


10. The set of all real-valued functions f defined everywhere on the real line and such that f(1) = 0 with the operations used in
Example 6.

11. The set of all pairs of real numbers of the form (1, x) with the operations
(1, y) + (1, y') = (1, y + y') and k(1, y) = (1, ky)
Answer:

The set is a vector space with the given operations.

12. The set of polynomials of the form a0 + a1x with the operations
(a0 + a1x) + (b0 + b1x) = (a0 + b0) + (a1 + b1)x
and
k(a0 + a1x) = (ka0) + (ka1)x

13. Verify Axioms 3, 7, 8, and 9 for the vector space given in Example 4.

14. Verify Axioms 1, 2, 3, 7, 8, 9, and 10 for the vector space given in Example 6.

15. With the addition and scalar multiplication operations defined in Example 7, show that V = R^2 satisfies Axioms 1-9.

16. Verify Axioms 1, 2, 3, 6, 8, 9, and 10 for the vector space given in Example 8.

17. Show that the set of all points in R^2 lying on a line is a vector space with respect to the standard operations of vector
addition and scalar multiplication if and only if the line passes through the origin.

18. Show that the set of all points in R^3 lying in a plane is a vector space with respect to the standard operations of vector
addition and scalar multiplication if and only if the plane passes through the origin.


In Exercises 19-21, prove that the given set with the stated operations is a vector space.

19. The set V = {0} with the operations of addition and scalar multiplication given in Example 1.

20. The set R^∞ of all infinite sequences of real numbers with the operations of addition and scalar multiplication given in
Example 3.

21. The set M_mn of all m x n matrices with the usual operations of addition and scalar multiplication.

22. Prove part (d) of Theorem 4.1.1.

23. The argument that follows proves that if u, v, and w are vectors in a vector space V such that u + w = v + w, then u = v
(the cancellation law for vector addition). As illustrated, justify the steps by filling in the blanks.

u + w = v + w                           Hypothesis
(u + w) + (-w) = (v + w) + (-w)         Add -w to both sides.
u + [w + (-w)] = v + [w + (-w)]         ______
u + 0 = v + 0                           ______
u = v                                   ______

24. Let v be any vector in a vector space V. Prove that 0v = 0.

25. Below is a seven-step proof of part (b) of Theorem 4.1.1. Justify each step either by stating that it is true by hypothesis or by
specifying which of the ten vector space axioms applies.

Hypothesis: Let u be any vector in a vector space V, let 0 be the zero vector in V, and let k be a scalar.

Conclusion: Then k0 = 0.

Proof:

(1) k0 + ku = k(0 + u)
(2) = ku
(3) Since ku is in V, -ku is in V.
(4) Therefore, (k0 + ku) + (-ku) = ku + (-ku).
(5) k0 + (ku + (-ku)) = ku + (-ku)
(6) k0 + 0 = 0
(7) k0 = 0


26. Let v be any vector in a vector space V. Prove that —v = ( —1)v. 


27. Prove: If u is a vector in a vector space V and k a scalar such that ku = 0, then either k = 0 or u = 0. [Suggestion: Show
that if ku = 0 and k ≠ 0, then u = 0. The result then follows as a logical consequence of this.]


True-False Exercises 
In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 
(a) A vector is a directed line segment (an arrow). 

Answer: 


False 


(b) A vector is an n-tuple of real numbers. 
Answer: 


False 


(c) A vector is any element of a vector space. 
Answer: 


True 


(d) There is a vector space consisting of exactly two distinct vectors. 
Answer: 


False 


(e) The set of polynomials with degree exactly 1 is a vector space under the operations defined in Exercise 12. 
Answer: 


False 
4.2 Subspaces


It is possible for one vector space to be contained within another. In this section we will explore this idea,
discuss how to recognize such vector spaces, and give a variety of examples that will be used in
our later work.


We will begin with some terminology. 


DEFINITION 1 


A subset W of a vector space V is called a subspace of V if W is itself a vector space under the addition 
and scalar multiplication defined on V. 


In general, to show that a nonempty set W with two operations is a vector space one must verify the ten vector
space axioms. However, if W is a subspace of a known vector space V, then certain axioms need not be verified
because they are "inherited" from V. For example, it is not necessary to verify that u + v = v + u holds in W
because it holds for all vectors in V including those in W. On the other hand, it is necessary to verify that W is
closed under addition and scalar multiplication since it is possible that adding two vectors in W or multiplying a
vector in W by a scalar produces a vector in V that is outside of W (Figure 4.2.1).


Figure 4.2.1 The vectors u and v are in W, but the vectors u + v and ku are not


Those axioms that are not inherited by W are 

Axiom 1—Closure of W under addition 

Axiom 4—Existence of a zero vector in W 

Axiom 5—Existence of a negative in W for every vector in W 
Axiom 6—Closure of W under scalar multiplication 


so these must be verified to prove that it is a subspace of V. However, the following theorem shows that if 
Axiom 1 and Axiom 6 hold in W, then Axioms 4 and 5 hold in W as a consequence and hence need not be 
verified. 


THEOREM 4.2.1 


If W is a set of one or more vectors in a vector space V, then W is a subspace of V if and only if the
following conditions hold.
(a) If u and v are vectors in W, then u + v is in W.

(b) If k is any scalar and u is any vector in W, then ku is in W.


In words, Theorem 4.2.1 states that W is a
subspace of V if and only if it is closed under
addition and scalar multiplication.


Proof If W is a subspace of V, then all the vector space axioms hold in W, including Axioms 1 and 6, which
are precisely conditions (a) and (b).

Conversely, assume that conditions (a) and (b) hold. Since these are Axioms 1 and 6, and since Axioms 2, 3, 7,
8, 9, and 10 are inherited from V, we only need to show that Axioms 4 and 5 hold in W. For this purpose, let u
be any vector in W. It follows from condition (b) that ku is a vector in W for every scalar k. In particular,
0u = 0 and (-1)u = -u are in W, which shows that Axioms 4 and 5 hold in W.


Note that every vector space has at least two 
subspaces, itself and its zero subspace. 


EXAMPLE 1 The Zero Subspace

If V is any vector space, and if W = {0} is the subset of V that consists of the zero vector only,
then W is closed under addition and scalar multiplication since

0 + 0 = 0 and k0 = 0
for any scalar k. We call W the zero subspace of V.


EXAMPLE 2 Lines Through the Origin Are Subspaces of R^2 and of R^3

If W is a line through the origin of either R^2 or R^3, then adding two vectors on the line W or multiplying a vector
on the line W by a scalar produces another vector on the line W, so W is closed under addition and scalar
multiplication (see Figure 4.2.2 for an illustration in R^3).


(a) W is closed under addition. (b) W is closed under scalar 
multiplication. 


Figure 4.2.2 


EXAMPLE 3 Planes Through the Origin Are Subspaces of R^3

If u and v are vectors in a plane W through the origin of R^3, then it is evident geometrically that u + v
and ku lie in the same plane W for any scalar k (Figure 4.2.3). Thus W is closed under addition and
scalar multiplication.


Figure 4.2.3 The vectors u + v and ku both lie in the same plane as u and v


Table 1 that follows gives a list of subspaces of R^2 and of R^3 that we have encountered thus far. We will see
later that these are the only subspaces of R^2 and of R^3.


Table 1
Subspaces of R^2              Subspaces of R^3
* {0}                         * {0}
* Lines through the origin    * Lines through the origin
* R^2                         * Planes through the origin
                              * R^3


EXAMPLE 4 A Subset of R^2 That Is Not a Subspace

Let W be the set of all points (x, y) in R^2 for which x > 0 and y > 0 (the shaded region in Figure
4.2.4). This set is not a subspace of R^2 because it is not closed under scalar multiplication. For
example, v = (1, 1) is a vector in W, but (-1)v = (-1, -1) is not.


Figure 4.2.4 W is not closed under scalar multiplication
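A failure of closure can be exhibited with a single counterexample, as in Example 4. The short sketch below repeats that check; the helper names `in_W`, `add`, and `scale` are illustrative choices of ours, not notation from the text.

```python
def in_W(p):
    """Membership test for W: points (x, y) with x > 0 and y > 0."""
    x, y = p
    return x > 0 and y > 0


def add(u, v):
    """Standard vector addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])


def scale(k, u):
    """Standard scalar multiplication in R^2."""
    return (k * u[0], k * u[1])


v = (1, 1)
w = scale(-1, v)          # the scalar multiple (-1)v

print(in_W(v))            # v lies in W
print(in_W(w))            # (-1)v does not, so W is not closed under scalar multiplication
print(in_W(add(v, v)))    # sums of vectors in W stay in W, but that alone is not enough
```

One counterexample is enough to show that a set is not a subspace, whereas no finite number of successful checks can prove that a set is one.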


EXAMPLE 5 Subspaces of M_nn

We know from Theorem 1.7.2 that the sum of two symmetric n x n matrices is symmetric and
that a scalar multiple of a symmetric n x n matrix is symmetric. Thus, the set of symmetric n x n
matrices is closed under addition and scalar multiplication and hence is a subspace of M_nn.
Similarly, the sets of upper triangular matrices, lower triangular matrices, and diagonal matrices
are subspaces of M_nn.


EXAMPLE 6 A Subset of M_nn That Is Not a Subspace

The set W of invertible n x n matrices is not a subspace of M_nn, failing on two counts: it is not
closed under addition and not closed under scalar multiplication. We will illustrate this with an
example in M_22 that you can readily adapt to M_nn. Consider the matrices

U = [1  2]    and    V = [-1  2]
    [2  5]               [-2  5]

The matrix 0U is the 2 x 2 zero matrix and hence is not invertible, and the matrix U + V has a
column of zeros, so it also is not invertible.
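The two failures described in Example 6 can be confirmed with determinants. In the sketch below, the entries of `U` and `V` are an assumption on our part: they are one pair of invertible 2 x 2 matrices whose sum has a column of zeros, as the example requires, and the helper functions are hypothetical names introduced only for this illustration.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def madd(a, b):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mscale(k, a):
    """Scalar multiple of a 2 x 2 matrix."""
    return [[k * a[i][j] for j in range(2)] for i in range(2)]


# Assumed entries: both matrices are invertible (nonzero determinant).
U = [[1, 2], [2, 5]]
V = [[-1, 2], [-2, 5]]

print(det2(U), det2(V))        # both nonzero, so U and V are invertible
print(madd(U, V))              # first column is zero
print(det2(madd(U, V)))        # zero determinant: U + V is not invertible
print(det2(mscale(0, U)))      # zero determinant: 0U is not invertible
```

A 2 x 2 matrix is invertible exactly when its determinant is nonzero, which is why the determinant serves as the membership test here.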


CALCULUS REQUIRED 


EXAMPLE 7 The Subspace C(-∞, ∞)

There is a theorem in calculus which states that a sum of continuous functions is continuous and
that a constant times a continuous function is continuous. Rephrased in vector language, the set
of continuous functions on (-∞, ∞) is a subspace of F(-∞, ∞). We will denote this
subspace by C(-∞, ∞).


CALCULUS REQUIRED 


EXAMPLE 8 Functions with Continuous Derivatives

A function with a continuous derivative is said to be continuously differentiable. There is a
theorem in calculus which states that the sum of two continuously differentiable functions is
continuously differentiable and that a constant times a continuously differentiable function is
continuously differentiable. Thus, the functions that are continuously differentiable on
(-∞, ∞) form a subspace of F(-∞, ∞). We will denote this subspace by
C^1(-∞, ∞), where the superscript emphasizes that the first derivative is continuous. To take
this a step further, the set of functions with m continuous derivatives on (-∞, ∞) is a
subspace of F(-∞, ∞), as is the set of functions with derivatives of all orders on
(-∞, ∞). We will denote these subspaces by C^m(-∞, ∞) and C^∞(-∞, ∞),
respectively.


EXAMPLE 9 The Subspace of All Polynomials

Recall that a polynomial is a function that can be expressed in the form
p(x) = a0 + a1x + ... + an x^n (1)

where a0, a1, ..., an are constants. It is evident that the sum of two polynomials is a
polynomial and that a constant times a polynomial is a polynomial. Thus, the set W of all
polynomials is closed under addition and scalar multiplication and hence is a subspace of
F(-∞, ∞). We will denote this space by P_∞.


EXAMPLE 10 The Subspace of Polynomials of Degree ≤ n

Recall that the degree of a polynomial is the highest power of the variable that occurs with a
nonzero coefficient. Thus, for example, if an ≠ 0 in Formula 1, then that polynomial has degree n.
It is not true that the set W of polynomials with positive degree n is a subspace of F(-∞, ∞)
because that set is not closed under addition. For example, the polynomials

1 + 2x + 3x^2 and 5 + 7x - 3x^2

both have degree 2, but their sum has degree 1. What is true, however, is that for each nonnegative
integer n the polynomials of degree n or less form a subspace of F(-∞, ∞). We will denote
this space by P_n.
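The loss of degree under addition in Example 10 is easy to reproduce by storing a polynomial as its list of coefficients. This representation and the names `poly_add` and `degree` are assumptions made for the sketch; the convention that every constant has degree zero follows this text.

```python
def poly_add(p, q):
    """Add two polynomials stored as coefficient lists [a0, a1, a2, ...]."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))   # pad the shorter list with zeros
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def degree(p):
    """Highest power with a nonzero coefficient; constants have degree 0."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return 0


p = [1, 2, 3]     # 1 + 2x + 3x^2
q = [5, 7, -3]    # 5 + 7x - 3x^2
s = poly_add(p, q)

print(degree(p), degree(q))   # both have degree 2
print(s, degree(s))           # the sum 6 + 9x has degree 1
```

The sum still has degree 2 or less, which is consistent with the polynomials of degree n or less being closed under addition.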


In this text we regard all constants to be 
polynomials of degree zero. Be aware, however, 
that some authors do not assign a degree to the 
constant 0. 


The Hierarchy of Function Spaces 


It is proved in calculus that polynomials are continuous functions and have continuous derivatives of all orders
on (-∞, ∞). Thus, it follows that P_∞ is not only a subspace of F(-∞, ∞), as previously observed, but
is also a subspace of C^∞(-∞, ∞). We leave it for you to convince yourself that the vector spaces
discussed in Example 7 to Example 10 are "nested" one inside the other as illustrated in Figure 4.2.5.


Figure 4.2.5 


Remark In our previous examples, and as illustrated in Figure 4.2.5, we have only considered functions that
are defined at all points of the interval (-∞, ∞). Sometimes we will want to consider functions that are
only defined on some subinterval of (-∞, ∞), say the closed interval [a, b] or the open interval (a, b). In
such cases we will make an appropriate notation change. For example, C[a, b] is the space of continuous
functions on [a, b] and C(a, b) is the space of continuous functions on (a, b).


Building Subspaces 


The following theorem provides a useful way of creating a new subspace from known subspaces. 


THEOREM 4.2.2


If W1, W2, ..., Wr are subspaces of a vector space V, then the intersection of these subspaces is also a
subspace of V.


Note that the first step in proving Theorem 4.2.2 
was to establish that W contained at least one 
vector. This is important, for otherwise the 
subsequent argument might be logically correct 
but meaningless. 


Proof Let W be the intersection of the subspaces W1, W2, ..., Wr. This set is not empty because each of these
subspaces contains the zero vector of V, and hence so does their intersection. Thus, it remains to show that W is
closed under addition and scalar multiplication.


To prove closure under addition, let u and v be vectors in W. Since W is the intersection of W1, W2, ..., Wr, it
follows that u and v also lie in each of these subspaces. Since these subspaces are all closed under addition,
they all contain the vector u + v and hence so does their intersection W. This proves that W is closed under
addition. We leave the proof that W is closed under scalar multiplication to you.


Sometimes we will want to find the “smallest” subspace of a vector space V that contains all of the vectors in 
some set of interest. The following definition, which generalizes Definition 4 of Section 3.1, will help us to do 
that. 


If r = 1, then Equation 2 has the form
w = k1v1, in which case the linear combination
is just a scalar multiple of v1.


DEFINITION 2 


If w is a vector in a vector space V, then w is said to be a linear combination of the vectors
v1, v2, ..., vr in V if w can be expressed in the form

w = k1v1 + k2v2 + ... + krvr (2)

where k1, k2, ..., kr are scalars. These scalars are called the coefficients of the linear combination.


THEOREM 4.2.3 


If S = {w1, w2, ..., wr} is a nonempty set of vectors in a vector space V, then:
(a) The set W of all possible linear combinations of the vectors in S is a subspace of V.

(b) The set W in part (a) is the "smallest" subspace of V that contains all of the vectors in S in the sense
that any other subspace that contains those vectors contains W.


Proof (a) Let W be the set of all possible linear combinations of the vectors in S. We must show that W is
closed under addition and scalar multiplication. To prove closure under addition, let

u = c1w1 + c2w2 + ... + crwr and v = k1w1 + k2w2 + ... + krwr
be two vectors in W. It follows that their sum can be written as
u + v = (c1 + k1)w1 + (c2 + k2)w2 + ... + (cr + kr)wr

which is a linear combination of the vectors in S. Thus, W is closed under addition. We leave it for you to prove
that W is also closed under scalar multiplication and hence is a subspace of V.


Proof (b) Let W' be any subspace of V that contains all of the vectors in S. Since W' is closed under addition
and scalar multiplication, it contains all linear combinations of the vectors in S and hence contains W.


The following definition gives some important notation and terminology related to Theorem 4.2.3. 


DEFINITION 3 


The subspace of a vector space V that is formed from all possible linear combinations of the vectors in
a nonempty set S is called the span of S, and we say that the vectors in S span that subspace. If
S = {w1, w2, ..., wr}, then we denote the span of S by

span{w1, w2, ..., wr} or span(S)


EXAMPLE 11 The Standard Unit Vectors Span R^n

Recall that the standard unit vectors in R^n are
e1 = (1, 0, 0, ..., 0), e2 = (0, 1, 0, ..., 0), ..., en = (0, 0, 0, ..., 1)
These vectors span R^n since every vector v = (v1, v2, ..., vn) in R^n can be expressed as
v = v1e1 + v2e2 + ... + vnen
which is a linear combination of e1, e2, ..., en. Thus, for example, the vectors
i = (1, 0, 0), j = (0, 1, 0), k = (0, 0, 1)
span R^3 since every vector v = (a, b, c) in this space can be expressed as

v = (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ai + bj + ck.


EXAMPLE 12 A Geometric View of Spanning in R^2 and R^3

(a) If v is a nonzero vector in R^2 or R^3 that has its initial point at the origin, then span{v}, which
is the set of all scalar multiples of v, is the line through the origin determined by v. You should
be able to visualize this from Figure 4.2.6a by observing that the tip of the vector kv can be
made to fall at any point on the line by choosing the value of k appropriately.


George William Hill (1838-1914) 


Historical Note The terms linearly independent and linearly dependent were 
introduced by Maxime Bócher (see p. 7) in his book Introduction to Higher Algebra, 
published in 1907. The term linear combination is due to the American mathematician 
G. W. Hill, who introduced it in a research paper on planetary motion published in 
1900. Hill was a “loner” who preferred to work out of his home in West Nyack, New 
York, rather than in academia, though he did try lecturing at Columbia University for a 
few years. Interestingly, he apparently returned the teaching salary, indicating that he 
did not need the money and did not want to be bothered looking after it. Although 
technically a mathematician, Hill had little interest in modern developments of 
mathematics and worked almost entirely on the theory of planetary orbits. 

[Image: Courtesy of the American Mathematical Society] 


(b) If v1 and v2 are nonzero vectors in R^3 that have their initial points at the origin, then
span{v1, v2}, which consists of all linear combinations of v1 and v2, is the plane through the
origin determined by these two vectors. You should be able to visualize this from Figure 4.2.6b
by observing that the tip of the vector k1v1 + k2v2 can be made to fall at any point in the
plane by adjusting the scalars k1 and k2 to lengthen, shorten, or reverse the directions of the
vectors k1v1 and k2v2 appropriately.


(a) span{v} is the line through the origin determined by v.
(b) span{v1, v2} is the plane through the origin determined by v1 and v2.


Figure 4.2.6


EXAMPLE 13 A Spanning Set for P_n

The polynomials 1, x, x^2, ..., x^n span the vector space P_n defined in Example 10 since each
polynomial p in P_n can be written as

p = a0 + a1x + ... + an x^n

which is a linear combination of 1, x, x^2, ..., x^n. We can denote this by writing

P_n = span{1, x, x^2, ..., x^n}


The next two examples are concerned with two important types of problems:
* Given a set S of vectors in R^n and a vector v in R^n, determine whether v is a linear combination of the
vectors in S.

* Given a set S of vectors in R^n, determine whether the vectors span R^n.


EXAMPLE 14 Linear Combinations

Consider the vectors u = (1, 2, -1) and v = (6, 4, 2) in R^3. Show that w = (9, 2, 7) is a
linear combination of u and v and that w' = (4, -1, 8) is not a linear combination of u and v.


Solution In order for w to be a linear combination of u and v, there must be scalars k1 and k2
such that w = k1u + k2v; that is,
(9, 2, 7) = k1(1, 2, -1) + k2(6, 4, 2)
or
(9, 2, 7) = (k1 + 6k2, 2k1 + 4k2, -k1 + 2k2)

Equating corresponding components gives

k1 + 6k2 = 9
2k1 + 4k2 = 2
-k1 + 2k2 = 7
Solving this system using Gaussian elimination yields k1 = -3, k2 = 2, so
w = -3u + 2v

Similarly, for w' to be a linear combination of u and v, there must be scalars k1 and k2 such that
w' = k1u + k2v; that is,

(4, -1, 8) = k1(1, 2, -1) + k2(6, 4, 2)
or

(4, -1, 8) = (k1 + 6k2, 2k1 + 4k2, -k1 + 2k2)

Equating corresponding components gives

k1 + 6k2 = 4
2k1 + 4k2 = -1
-k1 + 2k2 = 8

This system of equations is inconsistent (verify), so no such scalars k1 and k2 exist.
Consequently, w' is not a linear combination of u and v.
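The elimination in Example 14 can be carried out mechanically. The following is a minimal sketch, assuming exact rational arithmetic through Python's standard `fractions` module; the function name `solve_combination` is hypothetical, and when a system has more than one solution the sketch simply sets the free unknowns to zero.

```python
from fractions import Fraction


def solve_combination(vectors, target):
    """Return scalars k with k[0]*vectors[0] + ... = target, or None if none exist."""
    rows, cols = len(target), len(vectors)
    # Augmented matrix: column j holds vectors[j]; the last column holds target.
    aug = [[Fraction(vectors[j][i]) for j in range(cols)] + [Fraction(target[i])]
           for i in range(rows)]
    pivots = []
    r = 0
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        pr = next((i for i in range(r, rows) if aug[i][c] != 0), None)
        if pr is None:
            continue
        aug[r], aug[pr] = aug[pr], aug[r]
        pivot = aug[r][c]
        aug[r] = [x / pivot for x in aug[r]]          # make the pivot entry 1
        for i in range(rows):
            if i != r and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    # A remaining row of the form 0 = nonzero means the system is inconsistent.
    if any(aug[i][cols] != 0 for i in range(r, rows)):
        return None
    coeffs = [Fraction(0)] * cols
    for row_index, c in enumerate(pivots):
        coeffs[c] = aug[row_index][cols]
    return coeffs


u, v = (1, 2, -1), (6, 4, 2)
print(solve_combination([u, v], (9, 2, 7)))    # coefficients for w
print(solve_combination([u, v], (4, -1, 8)))   # None: w' is not a combination
```

Exact fractions are used instead of floating-point numbers so that the test for an inconsistent row is not affected by rounding.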


EXAMPLE 15 Testing for Spanning
Determine whether v1 = (1, 1, 2), v2 = (1, 0, 1), and v3 = (2, 1, 3) span the vector space R^3.


Solution We must determine whether an arbitrary vector b = (b1, b2, b3) in R^3 can be
expressed as a linear combination
b = k1v1 + k2v2 + k3v3
of the vectors v1, v2, and v3. Expressing this equation in terms of components gives
(b1, b2, b3) = k1(1, 1, 2) + k2(1, 0, 1) + k3(2, 1, 3)
or
(b1, b2, b3) = (k1 + k2 + 2k3, k1 + k3, 2k1 + k2 + 3k3)
or
k1 + k2 + 2k3 = b1
k1 + k3 = b2
2k1 + k2 + 3k3 = b3
Thus, our problem reduces to ascertaining whether this system is consistent for all values of b1,
b2, and b3. One way of doing this is to use parts (e) and (g) of Theorem 2.3.8, which state that
the system is consistent for every choice of b1, b2, and b3 if and only if its coefficient matrix

A = [1  1  2]
    [1  0  1]
    [2  1  3]

has a nonzero determinant. But this is not the case here; we leave it for you to confirm that
det(A) = 0, so v1, v2, and v3 do not span R^3.
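The determinant left for the reader in Example 15 can be confirmed by a cofactor expansion along the first row. The sketch below assumes integer entries so that the comparison with zero is exact; `det3` and `spans_R3` are names introduced here for illustration.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def spans_R3(v1, v2, v3):
    """Three vectors span R^3 exactly when the matrix with those columns has nonzero determinant."""
    A = [[v1[r], v2[r], v3[r]] for r in range(3)]   # the vectors are the columns of A
    return det3(A) != 0


v1, v2, v3 = (1, 1, 2), (1, 0, 1), (2, 1, 3)
A = [[v1[r], v2[r], v3[r]] for r in range(3)]

print(det3(A))                                      # 0
print(spans_R3(v1, v2, v3))                         # False
print(spans_R3((1, 0, 0), (0, 1, 0), (0, 0, 1)))    # True
```

In this example the failure can also be seen directly, since v3 = v1 + v2, so the three vectors lie in a common plane through the origin.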


Solution Spaces of Homogeneous Systems 


The solutions of a homogeneous linear system Ax = 0 of m equations in n unknowns can be viewed as vectors
in R^n. The following theorem provides a useful insight into the geometric structure of the solution set.


THEOREM 4.2.4


The solution set of a homogeneous linear system Ax = 0 in n unknowns is a subspace of R^n.


Proof Let W be the solution set for the system. The set W is not empty because it contains at least the trivial
solution x = 0.


To show that W is a subspace of R^n, we must show that it is closed under addition and scalar multiplication. To
do this, let x1 and x2 be vectors in W. Since these vectors are solutions of Ax = 0, we have
Ax1 = 0 and Ax2 = 0
It follows from these equations and the distributive property of matrix multiplication that
A(x1 + x2) = Ax1 + Ax2 = 0 + 0 = 0
so W is closed under addition. Similarly, if k is any scalar then
A(kx1) = kAx1 = k0 = 0

so W is also closed under scalar multiplication.

Because the solution set of a homogeneous
system in n unknowns is actually a subspace of
R^n, we will generally refer to it as the solution
space of the system.


EXAMPLE 16 Solution Spaces of Homogeneous Systems

Consider the linear systems

(a) [1  -2  3] [x]   [0]
    [2  -4  6] [y] = [0]
    [3  -6  9] [z]   [0]

(b) [ 1  -2   3] [x]   [0]
    [-3   7  -8] [y] = [0]
    [-2   4  -6] [z]   [0]

(c) [ 1  -2   3] [x]   [0]
    [-3   7  -8] [y] = [0]
    [ 4   1   2] [z]   [0]

(d) [0  0  0] [x]   [0]
    [0  0  0] [y] = [0]
    [0  0  0] [z]   [0]

Solution

(a) We leave it for you to verify that the solutions are
x = 2s - 3t, y = s, z = t

from which it follows that

x = 2y - 3z or x - 2y + 3z = 0

This is the equation of a plane through the origin that has n = (1, -2, 3) as a normal.

(b) We leave it for you to verify that the solutions are
x = -5t, y = -t, z = t

which are parametric equations for the line through the origin that is parallel to the vector
v = (-5, -1, 1).

(c) We leave it for you to verify that the only solution is x = 0, y = 0, z = 0, so the solution
space is {0}.

(d) This linear system is satisfied by all real values of x, y, and z, so the solution space is all of R^3.

Remark Whereas the solution set of every homogeneous system of m equations in n unknowns is a subspace
of R^n, it is never true that the solution set of a nonhomogeneous system of m equations in n unknowns is a
subspace of R^n. There are two possible scenarios: first, the system may not have any solutions at all, and
second, if there are solutions, then the solution set will not be closed under either addition or under scalar
multiplication (Exercise 18).


A Concluding Observation 


It is important to recognize that spanning sets are not unique. For example, any nonzero vector on the line in
Figure 4.2.6a will span that line, and any two noncollinear vectors in the plane in Figure 4.2.6b will span that
plane. The following theorem, whose proof we leave as an exercise, states conditions under which two sets of
vectors will span the same space.


THEOREM 4.2.5 


If S = {v1, v2, ..., vr} and S' = {w1, w2, ..., wk} are nonempty sets of vectors in a vector space V,
then

span{v1, v2, ..., vr} = span{w1, w2, ..., wk}

if and only if each vector in S is a linear combination of those in S', and each vector in S' is a linear
combination of those in S.


Concept Review 


* Subspace 


* Zero subspace 

* Examples of subspaces 

* Linear combination 

* Span 

* Solution space 

Skills 

* Determine whether a subset of a vector space is a subspace.
* Show that a subset of a vector space is a subspace. 


* Show that a nonempty subset of a vector space is not a subspace by demonstrating that the set is 
either not closed under addition or not closed under scalar multiplication. 


* Given a set S of vectors in R^n and a vector v in R^n, determine whether v is a linear combination of
the vectors in S.


* Given a set S of vectors in R^n, determine whether the vectors in S span R^n.


* Determine whether two nonempty sets of vectors in a vector space V span the same subspace of V. 


Exercise Set 4.2 


1. Use Theorem 4.2.1 to determine which of the following are subspaces of R^3.
(a) All vectors of the form (a, 0, 0).
(b) All vectors of the form (a, 1, 1).
(c) All vectors of the form (a, b, c), where b = a + c.
(d) All vectors of the form (a, b, c), where b = a + c + 1.
(e) All vectors of the form (a, b, 0).


Answer:


(a), (c), (e)
2. Use Theorem 4.2.1 to determine which of the following are subspaces of M_nn.
(a) The set of all diagonal n x n matrices.
(b) The set of all n x n matrices A such that det(A) = 0.
(c) The set of all n x n matrices A such that tr(A) = 0.
(d) The set of all symmetric n x n matrices.
(e) The set of all n x n matrices A such that A^T = -A.
(f) The set of all n x n matrices A for which Ax = 0 has only the trivial solution.
(g) The set of all n x n matrices A such that AB = BA for some fixed n x n matrix B.


3. Use Theorem 4.2.1 to determine which of the following are subspaces of P_3.

(a) All polynomials a0 + a1x + a2x^2 + a3x^3 for which a0 = 0.
(b) All polynomials a0 + a1x + a2x^2 + a3x^3 for which a0 + a1 + a2 + a3 = 0.
(c) All polynomials of the form a0 + a1x + a2x^2 + a3x^3 in which a0, a1, a2, and a3 are integers.
(d) All polynomials of the form a0 + a1x, where a0 and a1 are real numbers.


Answer:


(a), (b), (d)
4. Which of the following are subspaces of F(-∞, ∞)?
(a) All functions f in F(-∞, ∞) for which f(0) = 0.
(b) All functions f in F(-∞, ∞) for which f(0) = 1.
(c) All functions f in F(-∞, ∞) for which f(-x) = f(x).
(d) All polynomials of degree 2.
5. Which of the following are subspaces of R^∞?
(a) All sequences v in R^∞ of the form v = (v, 0, v, 0, v, 0, ...).
(b) All sequences v in R^∞ of the form v = (v, 1, v, 1, v, 1, ...).
(c) All sequences v in R^∞ of the form v = (v, 2v, 4v, 8v, 16v, ...).
(d) All sequences in R^∞ whose components are 0 from some point on.


Answer:


(a), (c), (d)

6. A line L through the origin in R^3 can be represented by parametric equations of the form x = at, y = bt,
and z = ct. Use these equations to show that L is a subspace of R^3 by showing that if v1 = (x1, y1, z1) and
v2 = (x2, y2, z2) are points on L and k is any real number, then kv1 and v1 + v2 are also points on L.

7. Which of the following are linear combinations of u = (0, -2, 2) and v = (1, 3, -1)?

(a) (2, 2, 2)
(b) (3, 1, 5)
(c) (0, 4, 5)
(d) (0, 0, 0)


Answer: 


(a), (b), (d) 
8. Express the following as linear combinations of u = (2, 1, 4), v = (1, -1, 3), and w = (3, 2, 5).
(a) (-9, -7, -15)
(b) (6, 11, 6)
(c) (0, 0, 0)
(d) (7, 8, 9)


9. Which of the following are linear combinations of 


[13] 
(b) | 
(с) | 


(à [—1 5 
7 1 


Answer: 


(a), (b), (c)
10. In each part express the vector as a linear combination of $\mathbf{p}_1 = 2 + x + 4x^2$, $\mathbf{p}_2 = 1 - x + 3x^2$, and
$\mathbf{p}_3 = 3 + 2x + 5x^2$.
(a) $-9 - 7x - 15x^2$
(b) $6 + 11x + 6x^2$
(c) $0$
(d) $7 + 8x + 9x^2$
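11. In each part, determine whether the given vectors span $R^3$.
(a) $\mathbf{v}_1 = (2, 2, 2)$, $\mathbf{v}_2 = (0, 0, 3)$, $\mathbf{v}_3 = (0, 1, 1)$
(b) $\mathbf{v}_1 = (2, -1, 3)$, $\mathbf{v}_2 = (4, 1, 2)$, $\mathbf{v}_3 = (8, -1, 8)$
(c) $\mathbf{v}_1 = (3, 1, 4)$, $\mathbf{v}_2 = (2, -3, 5)$, $\mathbf{v}_3 = (5, -2, 9)$, $\mathbf{v}_4 = (1, 4, -1)$
(d) $\mathbf{v}_1 = (1, 2, 6)$, $\mathbf{v}_2 = (3, 4, 1)$, $\mathbf{v}_3 = (4, 3, 1)$, $\mathbf{v}_4 = (5, 5, 1)$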


Answer: 


(a) The vectors span $R^3$.
(b) The vectors do not span $R^3$.
(c) The vectors do not span $R^3$.
(d) The vectors span $R^3$.
12. Suppose that $\mathbf{v}_1 = (2, 1, 0, 3)$, $\mathbf{v}_2 = (3, -1, 5, 2)$, and $\mathbf{v}_3 = (-1, 0, 2, 1)$. Which of the following
vectors are in $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$?
(a) $(2, 3, -7, 3)$
(b) $(0, 0, 0, 0)$
(c) $(1, 1, 1, 1)$
(d) $(-4, 6, -13, 4)$


13. Determine whether the following polynomials span $P_2$.

$\mathbf{p}_1 = 1 - x + 2x^2$, $\mathbf{p}_2 = 3 + x$,
$\mathbf{p}_3 = 5 - x + 4x^2$, $\mathbf{p}_4 = -2 - 2x + 2x^2$


Answer:


The polynomials do not span $P_2$.


14. Let $\mathbf{f} = \cos^2 x$ and $\mathbf{g} = \sin^2 x$. Which of the following lie in the space spanned by $\mathbf{f}$ and $\mathbf{g}$?

(a) $\cos 2x$
(b) $3 + x^2$
(c) $1$
(d) $\sin x$
(e) $0$

15. Determine whether the solution space of the system $A\mathbf{x} = \mathbf{0}$ is a line through the origin, a plane through the
origin, or the origin only. If it is a plane, find an equation for it. If it is a line, find parametric equations for
it.


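(a) $A = \begin{bmatrix} -1 & 1 & 1 \\ 3 & -1 & 0 \\ 2 & -4 & -5 \end{bmatrix}$
(b) $A = \begin{bmatrix} 1 & -2 & 3 \\ -3 & 6 & 9 \\ -2 & 4 & -6 \end{bmatrix}$
(c) $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}$
(d) $A = \begin{bmatrix} 1 & 2 & -6 \\ 1 & 4 & 4 \\ 3 & 10 & 6 \end{bmatrix}$
(e) $A = \begin{bmatrix} 1 & -1 & 1 \\ 2 & -1 & 4 \\ 3 & 1 & 11 \end{bmatrix}$
(f) $A = \begin{bmatrix} 1 & -3 & 1 \\ 2 & -6 & 2 \\ 3 & -9 & 3 \end{bmatrix}$

Answer:

(a) Line; $x = -\tfrac{1}{2}t$, $y = -\tfrac{3}{2}t$, $z = t$
(b) Line; $x = 2t$, $y = t$, $z = 0$
(c) Origin
(d) Origin
(e) Line; $x = -3t$, $y = -2t$, $z = t$
(f) Plane; $x - 3y + z = 0$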
16. (Calculus required) Show that the following sets of functions are subspaces of $F(-\infty, \infty)$.
(a) All continuous functions on $(-\infty, \infty)$.
(b) All differentiable functions on $(-\infty, \infty)$.
(c) All differentiable functions on $(-\infty, \infty)$ that satisfy $f' + 2f = 0$.


17. (Calculus required) Show that the set of continuous functions $f = f(x)$ on $[a, b]$ such that

$$\int_a^b f(x)\,dx = 0$$

is a subspace of $C[a, b]$.


18. Show that the solution vectors of a consistent nonhomogeneous system of $m$ linear equations in $n$
unknowns do not form a subspace of $R^n$.


19. Prove Theorem 4.2.5.


20. Use Theorem 4.2.5 to show that the vectors $\mathbf{v}_1 = (1, 6, 4)$, $\mathbf{v}_2 = (2, 4, -1)$, $\mathbf{v}_3 = (-1, 2, 5)$, and the
vectors $\mathbf{w}_1 = (1, -2, -5)$, $\mathbf{w}_2 = (0, 8, 9)$ span the same subspace of $R^3$.


True-False Exercises 
In parts (a)-(k) determine whether the statement is true or false, and justify your answer. 
(a) Every subspace of a vector space is itself a vector space. 

Answer: 


True 


(b) Every vector space is a subspace of itself. 
Answer: 


True 


(c) Every subset of a vector space V that contains the zero vector in V is a subspace of V. 
Answer: 


False 
(d) The set $R^2$ is a subspace of $R^3$.


Answer: 


False 


(e) The solution set of a consistent linear system $A\mathbf{x} = \mathbf{b}$ of $m$ equations in $n$ unknowns is a subspace of $R^n$.


Answer: 


False 


(f) The span of any finite set of vectors in a vector space is closed under addition and scalar multiplication. 
Answer: 


True 


(g) The intersection of any two subspaces of a vector space V is a subspace of V. 
Answer: 


True 


(h) The union of any two subspaces of a vector space V is a subspace of V. 
Answer: 


False 


(i) Two subsets of a vector space V that span the same subspace of V must be equal. 
Answer: 


False 


(j) The set of upper triangular $n \times n$ matrices is a subspace of the vector space of all $n \times n$ matrices.
Answer: 
True 

(k) The polynomials $x - 1$, $(x - 1)^2$, and $(x - 1)^3$ span $P_3$.
Answer: 


False 
4.3 Linear Independence


In this section we will consider the question of whether the vectors in a given set are interrelated in the sense 
that one or more of them can be expressed as a linear combination of the others. This is important to know in 
applications because the existence of such relationships often signals that some kind of complication is likely 
to occur. 


Extraneous Vectors 


In a rectangular xy-coordinate system every vector in the plane can be expressed in exactly one way as a
linear combination of the standard unit vectors. For example, the only way to express the vector (3, 2) as a
linear combination of $\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$ is


$(3, 2) = 3(1, 0) + 2(0, 1) = 3\mathbf{i} + 2\mathbf{j}$ (1)


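(Figure 4.3.1). Suppose, however, that we were to introduce a third coordinate axis that makes an angle of 45°
with the x-axis. Call it the w-axis. As illustrated in Figure 4.3.2, the unit vector along the w-axis is


$$\mathbf{w} = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$$

Whereas Formula (1) shows the only way to express the vector (3, 2) as a linear combination of $\mathbf{i}$ and $\mathbf{j}$, there
are infinitely many ways to express this vector as a linear combination of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{w}$. Three possibilities are


$$\begin{aligned}
(3, 2) &= 3(1, 0) + 2(0, 1) + 0\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) = 3\mathbf{i} + 2\mathbf{j} + 0\mathbf{w} \\
(3, 2) &= 2(1, 0) + (0, 1) + \sqrt{2}\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) = 2\mathbf{i} + \mathbf{j} + \sqrt{2}\,\mathbf{w} \\
(3, 2) &= 4(1, 0) + 3(0, 1) - \sqrt{2}\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) = 4\mathbf{i} + 3\mathbf{j} - \sqrt{2}\,\mathbf{w}
\end{aligned}$$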


In short, by introducing a superfluous axis we created the complication of having multiple ways of assigning 
coordinates to points in the plane. What makes the vector w superfluous is the fact that it can be expressed as 
a linear combination of the vectors i and j, namely, 


$$\mathbf{w} = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) = \frac{1}{\sqrt{2}}\mathbf{i} + \frac{1}{\sqrt{2}}\mathbf{j}$$

Thus, one of our main tasks in this section will be to develop ways of ascertaining whether one vector in a set
S is a linear combination of other vectors in S.
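
The non-uniqueness described above can be checked numerically. The following is a minimal sketch, assuming only the vectors i, j, and w from the text; the helper names `combine` and `all_equal_target` are illustrative and not part of the original.

```python
import math

# Standard unit vectors and the extra unit vector along the 45-degree w-axis.
i = (1.0, 0.0)
j = (0.0, 1.0)
w = (1 / math.sqrt(2), 1 / math.sqrt(2))

def combine(a, b, c):
    """Return a*i + b*j + c*w as a 2-tuple (hypothetical helper)."""
    return (a * i[0] + b * j[0] + c * w[0],
            a * i[1] + b * j[1] + c * w[1])

def all_equal_target(points, target=(3.0, 2.0), tol=1e-12):
    """True if every point agrees with the target up to rounding error."""
    return all(abs(p[0] - target[0]) < tol and abs(p[1] - target[1]) < tol
               for p in points)

# The three coefficient triples (a, b, c) listed in the text.
triples = [(3, 2, 0), (2, 1, math.sqrt(2)), (4, 3, -math.sqrt(2))]
results = [combine(*t) for t in triples]

print(all_equal_target(results))
```

All three coefficient triples land on the same point (3, 2), which is the complication a superfluous vector introduces.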


Figure 4.3.2 


Linear Independence and Dependence 


We will often apply the terms linearly
independent and linearly dependent to the
vectors themselves rather than to the set.


DEFINITION 1 


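If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a nonempty set of vectors in a vector space V, then the vector equation
$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$

has at least one solution, namely,
$k_1 = 0,\ k_2 = 0,\ \ldots,\ k_r = 0$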


We call this the trivial solution. If this is the only solution, then S is said to be a linearly independent 
set. If there are solutions in addition to the trivial solution, then S is said to be a linearly dependent 
set. 


EXAMPLE 1 Linear Independence of the Standard Unit Vectors in $R^n$


The most basic linearly independent set in $R^n$ is the set of standard unit vectors
$\mathbf{e}_1 = (1, 0, 0, \ldots, 0),\ \mathbf{e}_2 = (0, 1, 0, \ldots, 0),\ \ldots,\ \mathbf{e}_n = (0, 0, 0, \ldots, 1)$


For notational simplicity, we will prove the linear independence in $R^3$ of
$\mathbf{i} = (1, 0, 0),\ \mathbf{j} = (0, 1, 0),\ \mathbf{k} = (0, 0, 1)$


The linear independence or linear dependence of these vectors is determined by whether there exist nontrivial
solutions of the vector equation


$k_1\mathbf{i} + k_2\mathbf{j} + k_3\mathbf{k} = \mathbf{0}$ (2)


Since the component form of this equation is
$(k_1, k_2, k_3) = (0, 0, 0)$


it follows that $k_1 = k_2 = k_3 = 0$. This implies that (2) has only the trivial solution and hence that the vectors are
linearly independent.


EXAMPLE 2 Linear Independence in $R^3$


Determine whether the vectors
$\mathbf{v}_1 = (1, -2, 3),\ \mathbf{v}_2 = (5, 6, -1),\ \mathbf{v}_3 = (3, 2, 1)$


are linearly independent or linearly dependent in $R^3$.


Solution The linear independence or linear dependence of these vectors is determined by
whether there exist nontrivial solutions of the vector equation


$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3 = \mathbf{0}$ (3)


or, equivalently, of
$k_1(1, -2, 3) + k_2(5, 6, -1) + k_3(3, 2, 1) = (0, 0, 0)$
Equating corresponding components on the two sides yields the homogeneous linear system
$$\begin{aligned}
k_1 + 5k_2 + 3k_3 &= 0 \\
-2k_1 + 6k_2 + 2k_3 &= 0 \\
3k_1 - k_2 + k_3 &= 0
\end{aligned} \qquad (4)$$
Thus, our problem reduces to determining whether this system has nontrivial solutions. There
are various ways to do this; one possibility is to simply solve the system, which yields
$k_1 = -\tfrac{1}{2}t,\ k_2 = -\tfrac{1}{2}t,\ k_3 = t$
(we omit the details). This shows that the system has nontrivial solutions and hence that the
vectors are linearly dependent. A second method for obtaining the same result is to compute the
determinant of the coefficient matrix
$$A = \begin{bmatrix} 1 & 5 & 3 \\ -2 & 6 & 2 \\ 3 & -1 & 1 \end{bmatrix}$$
and use parts (b) and (g) of Theorem 2.3.8. We leave it for you to verify that $\det(A) = 0$, from
which it follows that (3) has nontrivial solutions and the vectors are linearly dependent.
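
Both methods in Example 2 can be checked with exact rational arithmetic. This is a minimal sketch, not the book's procedure: the helper names `det3` and `combo` are assumptions introduced here, and the parameter value t = 2 is an arbitrary choice.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def combo(coeffs, vectors):
    """Componentwise linear combination sum(k * v) of 3-component vectors."""
    return tuple(sum(k * v[r] for k, v in zip(coeffs, vectors)) for r in range(3))

v1, v2, v3 = (1, -2, 3), (5, 6, -1), (3, 2, 1)

# The columns of the coefficient matrix are the vectors themselves.
A = [[v1[r], v2[r], v3[r]] for r in range(3)]

# One nontrivial solution from the family k1 = -t/2, k2 = -t/2, k3 = t.
t = Fraction(2)
k = (-t / 2, -t / 2, t)

print(det3(A))
print(combo(k, (v1, v2, v3)))
```

A zero determinant and a nonzero coefficient triple that sends the combination to the zero vector are two views of the same dependence.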


In Example 2, what relationship do you see
between the components of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ and
the columns of the coefficient matrix A?


EXAMPLE 3 Linear Independence in $R^4$


Determine whether the vectors
$\mathbf{v}_1 = (1, 2, 2, -1),\ \mathbf{v}_2 = (4, 9, 9, -4),\ \mathbf{v}_3 = (5, 8, 9, -5)$


in $R^4$ are linearly dependent or linearly independent.


Solution The linear independence or linear dependence of these vectors is determined by
whether there exist nontrivial solutions of the vector equation


$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3 = \mathbf{0}$
or, equivalently, of
$k_1(1, 2, 2, -1) + k_2(4, 9, 9, -4) + k_3(5, 8, 9, -5) = (0, 0, 0, 0)$
Equating corresponding components on the two sides yields the homogeneous linear system
$$\begin{aligned}
k_1 + 4k_2 + 5k_3 &= 0 \\
2k_1 + 9k_2 + 8k_3 &= 0 \\
2k_1 + 9k_2 + 9k_3 &= 0 \\
-k_1 - 4k_2 - 5k_3 &= 0
\end{aligned}$$
We leave it for you to show that this system has only the trivial solution
$k_1 = 0,\ k_2 = 0,\ k_3 = 0$


from which you can conclude that $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are linearly independent.
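
When the vectors do not form a square matrix, as in Example 3, the determinant test is unavailable, but row reduction still settles the question. The sketch below swaps in a rank computation for solving the system by hand: a set is independent exactly when the rank of the matrix whose rows are the vectors equals the number of vectors. The function names `rank` and `is_independent` are assumptions made for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_independent(vectors):
    """A finite list of vectors is independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)

example2 = [(1, -2, 3), (5, 6, -1), (3, 2, 1)]
example3 = [(1, 2, 2, -1), (4, 9, 9, -4), (5, 8, 9, -5)]

print(is_independent(example2), is_independent(example3))
```

Exact fractions are used so that a zero pivot is never mistaken for a small rounding error.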


EXAMPLE 4 An Important Linearly Independent Set in $P_n$


Show that the polynomials
$1,\ x,\ x^2,\ \ldots,\ x^n$

form a linearly independent set in $P_n$.


Solution For convenience, let us denote the polynomials as


$\mathbf{p}_0 = 1,\ \mathbf{p}_1 = x,\ \mathbf{p}_2 = x^2,\ \ldots,\ \mathbf{p}_n = x^n$


We must show that the vector equation
$a_0\mathbf{p}_0 + a_1\mathbf{p}_1 + a_2\mathbf{p}_2 + \cdots + a_n\mathbf{p}_n = \mathbf{0}$ (5)


has only the trivial solution


$a_0 = a_1 = a_2 = \cdots = a_n = 0$


But (5) is equivalent to the statement that
$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$ (6)


for all x in $(-\infty, \infty)$, so we must show that this holds if and only if each coefficient in (6) is zero.
To see that this is so, recall from algebra that a nonzero polynomial of degree n has at most n
distinct roots. That being the case, each coefficient in (6) must be zero, for otherwise the left side of
the equation would be a nonzero polynomial with infinitely many roots. Thus, (5) has only the
trivial solution.


The following example shows that the problem of determining whether a given set of vectors in $P_n$ is linearly
independent or linearly dependent can be reduced to determining whether a certain set of vectors in $R^n$ is
linearly dependent or independent.


EXAMPLE 5 Linear Independence of Polynomials


Determine whether the polynomials


$\mathbf{p}_1 = 1 - x,\ \mathbf{p}_2 = 5 + 3x - 2x^2,\ \mathbf{p}_3 = 1 + 3x - x^2$


are linearly dependent or linearly independent in $P_2$.


Solution The linear independence or linear dependence of these vectors is determined by
whether there exist nontrivial solutions of the vector equation


$k_1\mathbf{p}_1 + k_2\mathbf{p}_2 + k_3\mathbf{p}_3 = \mathbf{0}$ (7)
This equation can be written as
$k_1(1 - x) + k_2(5 + 3x - 2x^2) + k_3(1 + 3x - x^2) = 0$ (8)


or, equivalently, as
$(k_1 + 5k_2 + k_3) + (-k_1 + 3k_2 + 3k_3)x + (-2k_2 - k_3)x^2 = 0$


Since this equation must be satisfied by all x in $(-\infty, \infty)$, each coefficient must be zero (as
explained in the previous example). Thus, the linear dependence or independence of the given
polynomials hinges on whether the following linear system has a nontrivial solution:

$$\begin{aligned}
k_1 + 5k_2 + k_3 &= 0 \\
-k_1 + 3k_2 + 3k_3 &= 0 \\
-2k_2 - k_3 &= 0
\end{aligned} \qquad (9)$$

We leave it for you to show that this linear system has nontrivial solutions, either by solving it
directly or by showing that the coefficient matrix has determinant zero. Thus, the set
$\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$ is linearly dependent.
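
The reduction from polynomials to coefficient vectors can be made concrete. In this minimal sketch each polynomial is stored as its coefficient list [a0, a1, a2]; the particular coefficients (3, -1, 2) are one nontrivial solution worked out here for illustration, not taken from the text, and the helper names are hypothetical.

```python
# Coefficient lists [a0, a1, a2] standing for a0 + a1*x + a2*x^2.
p1 = [1, -1, 0]    # 1 - x
p2 = [5, 3, -2]    # 5 + 3x - 2x^2
p3 = [1, 3, -1]    # 1 + 3x - x^2

def poly_combo(coeffs, polys):
    """Coefficient list of the linear combination sum(k * p)."""
    degree_slots = len(polys[0])
    return [sum(k * p[d] for k, p in zip(coeffs, polys)) for d in range(degree_slots)]

def poly_eval(p, x):
    """Evaluate the polynomial with coefficient list p at x."""
    return sum(a * x ** d for d, a in enumerate(p))

# Candidate nontrivial solution of the coefficient system: k1 = 3, k2 = -1, k3 = 2.
relation = poly_combo((3, -1, 2), (p1, p2, p3))
print(relation)
```

Every coefficient of the combination vanishes, so the combination is the zero polynomial and the set is dependent, in agreement with the example.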


In Example 5, what relationship do you see
between the coefficients of the given
polynomials and the column vectors of the
coefficient matrix of system (9)?


An Alternative Interpretation of Linear Independence 


The terms linearly dependent and linearly independent are intended to indicate whether the vectors in a given
set are interrelated in some way. The following theorem, whose proof is deferred to the end of this section, 
makes this idea more precise. 


THEOREM 4.3.1 


A set S with two or more vectors is 


(a) Linearly dependent if and only if at least one of the vectors in S is expressible as a linear 
combination of the other vectors in S. 


(b) Linearly independent if and only if no vector in S is expressible as a linear combination of the 
other vectors in S. 


EXAMPLE 6 Example 1 Revisited


In Example 1 we showed that the standard unit vectors in $R^n$ are linearly independent. Thus, it
follows from Theorem 4.3.1 that none of these vectors is expressible as a linear combination of
the others. To illustrate this in $R^3$, suppose, for example, that


$\mathbf{k} = k_1\mathbf{i} + k_2\mathbf{j}$
or in terms of components that
$(0, 0, 1) = (k_1, k_2, 0)$


Since this equation cannot be satisfied by any values of $k_1$ and $k_2$, there is no way to express $\mathbf{k}$
as a linear combination of $\mathbf{i}$ and $\mathbf{j}$. Similarly, $\mathbf{i}$ is not expressible as a linear combination of $\mathbf{j}$ and
$\mathbf{k}$, and $\mathbf{j}$ is not expressible as a linear combination of $\mathbf{i}$ and $\mathbf{k}$.


EXAMPLE 7 Example 2 Revisited


In Example 2 we saw that the vectors
$\mathbf{v}_1 = (1, -2, 3),\ \mathbf{v}_2 = (5, 6, -1),\ \mathbf{v}_3 = (3, 2, 1)$


are linearly dependent. Thus, it follows from Theorem 4.3.1 that at least one of these vectors is
expressible as a linear combination of the other two. We leave it for you to confirm that these
vectors satisfy the equation


$\tfrac{1}{2}\mathbf{v}_1 + \tfrac{1}{2}\mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}$
from which it follows, for example, that
$\mathbf{v}_3 = \tfrac{1}{2}\mathbf{v}_1 + \tfrac{1}{2}\mathbf{v}_2$
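
Because every coefficient in the dependence relation of Example 7 is nonzero, the same relation can be solved for each of the three vectors in turn. The following worked equations are a sketch added for illustration, using the vectors of Example 2; each line can be checked componentwise.

```latex
\begin{aligned}
\tfrac{1}{2}\mathbf{v}_1 + \tfrac{1}{2}\mathbf{v}_2 - \mathbf{v}_3 = \mathbf{0}
  \quad&\Longrightarrow\quad \mathbf{v}_3 = \tfrac{1}{2}\mathbf{v}_1 + \tfrac{1}{2}\mathbf{v}_2, \\
  &\Longrightarrow\quad \mathbf{v}_1 = -\mathbf{v}_2 + 2\mathbf{v}_3
     = -(5, 6, -1) + 2(3, 2, 1) = (1, -2, 3), \\
  &\Longrightarrow\quad \mathbf{v}_2 = -\mathbf{v}_1 + 2\mathbf{v}_3
     = -(1, -2, 3) + 2(3, 2, 1) = (5, 6, -1).
\end{aligned}
```

In general, only the vectors whose coefficients in a dependence relation are nonzero can be isolated in this way.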


Sets with One or Two Vectors 


The following basic theorem is concerned with the linear independence and linear dependence of sets with 
one or two vectors and sets that contain the zero vector. 


THEOREM 4.3.2 


(a) A finite set that contains 0 is linearly dependent. 
(b) A set with exactly one vector is linearly independent if and only if that vector is not 0.


(c) A set with exactly two vectors is linearly independent if and only if neither vector is a scalar
multiple of the other.


Józef Hoéné de Wronski (1778-1853)


Historical Note The Polish-French mathematician Józef Hoéné de Wronski was born Józef Hoéné
and adopted the name Wronski after he married. Wronski's life was fraught with controversy and
conflict, which some say was due to his psychopathic tendencies and his exaggeration of the
importance of his own work. Although Wronski's work was dismissed as rubbish for many years, and
much of it was indeed erroneous, some of his ideas contained hidden brilliance and have survived.
Among other things, Wronski designed a caterpillar vehicle to compete with trains (though it was
never manufactured) and did research on the famous problem of determining the longitude of a ship at
sea. His final years were spent in poverty.
[Image: Wikipedia]


We will prove part (a) and leave the rest as exercises. 


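Proof (a) For any vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, the set $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{0}\}$ is linearly dependent since the
equation
$0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + 0\mathbf{v}_r + 1(\mathbf{0}) = \mathbf{0}$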


expresses 0 as a linear combination of the vectors in S with coefficients that are not all zero. 


EXAMPLE 8 Linear Independence of Two Functions — 


The functions $\mathbf{f}_1 = x$ and $\mathbf{f}_2 = \sin x$ are linearly independent vectors in $F(-\infty, \infty)$ since
neither function is a scalar multiple of the other. On the other hand, the two functions
$\mathbf{g}_1 = \sin 2x$ and $\mathbf{g}_2 = \sin x \cos x$ are linearly dependent because the trigonometric identity
$\sin 2x = 2 \sin x \cos x$ reveals that $\mathbf{g}_1$ and $\mathbf{g}_2$ are scalar multiples of each other.


A Geometric Interpretation of Linear Independence 


Linear independence has the following useful geometric interpretations in $R^2$ and $R^3$:


* Two vectors in $R^2$ or $R^3$ are linearly independent if and only if they do not lie on the same line when they
have their initial points at the origin. Otherwise one would be a scalar multiple of the other (Figure 4.3.3).


(a) Linearly dependent (b) Linearly dependent (c) Linearly independent 


Figure 4.3.3 


* Three vectors in $R^3$ are linearly independent if and only if they do not lie in the same plane when they have
their initial points at the origin. Otherwise at least one would be a linear combination of the other two
(Figure 4.3.4).


(a) Linearly dependent (b) Linearly dependent (c) Linearly independent 


Figure 4.3.4 


At the beginning of this section we observed that a third coordinate axis in $R^2$ is superfluous by showing that
a unit vector along such an axis would have to be expressible as a linear combination of unit vectors along the
positive x- and y-axis. That result is a consequence of the next theorem, which shows that there can be at most
n vectors in any linearly independent set in $R^n$.


It follows from Theorem 4.3.3, for example,
that a set in $R^2$ with more than two vectors is
linearly dependent and a set in $R^3$ with more
than three vectors is linearly dependent.


THEOREM 4.3.3 


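Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a set of vectors in $R^n$. If $r > n$, then S is linearly dependent.


Proof Suppose that


$$\begin{aligned}
\mathbf{v}_1 &= (v_{11}, v_{12}, \ldots, v_{1n}) \\
\mathbf{v}_2 &= (v_{21}, v_{22}, \ldots, v_{2n}) \\
&\ \,\vdots \\
\mathbf{v}_r &= (v_{r1}, v_{r2}, \ldots, v_{rn})
\end{aligned}$$
and consider the equation

$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$


If we express both sides of this equation in terms of components and then equate the corresponding
components, we obtain the system


$$\begin{aligned}
v_{11}k_1 + v_{21}k_2 + \cdots + v_{r1}k_r &= 0 \\
v_{12}k_1 + v_{22}k_2 + \cdots + v_{r2}k_r &= 0 \\
&\ \,\vdots \\
v_{1n}k_1 + v_{2n}k_2 + \cdots + v_{rn}k_r &= 0
\end{aligned}$$
This is a homogeneous system of n equations in the r unknowns $k_1, \ldots, k_r$. Since $r > n$, it follows from
Theorem 1.2.2 that the system has nontrivial solutions. Therefore, $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly
dependent set.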
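
The proof is constructive in spirit: with more unknowns than equations, row reduction always leaves a free variable, and setting it to 1 produces a dependence relation. The sketch below illustrates this for three vectors in the plane (chosen here as sample data); the function name `nontrivial_solution` is an assumption, and the routine is one possible implementation rather than a prescribed method.

```python
from fractions import Fraction

def nontrivial_solution(vectors):
    """Return coefficients k (not all zero) with sum(k_i * v_i) = 0, or None.

    The vectors become the columns of a homogeneous system, which is brought
    to reduced row echelon form over the rationals.
    """
    r, n = len(vectors), len(vectors[0])
    # Row j of the system holds the j-th components of all r vectors.
    m = [[Fraction(v[j]) for v in vectors] for j in range(n)]
    pivots = []          # pivots[i] is the column of the leading 1 in row i
    row = 0
    for col in range(r):
        p = next((i for i in range(row, n) if m[i][col] != 0), None)
        if p is None:
            continue
        m[row], m[p] = m[p], m[row]
        lead = m[row][col]
        m[row] = [x / lead for x in m[row]]
        for i in range(n):
            if i != row and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[row])]
        pivots.append(col)
        row += 1
        if row == n:
            break
    free = [c for c in range(r) if c not in pivots]
    if not free:
        return None      # only the trivial solution exists
    k = [Fraction(0)] * r
    k[free[0]] = Fraction(1)          # set one free variable to 1
    for i, pc in enumerate(pivots):
        k[pc] = -m[i][free[0]]        # solve for each leading variable
    return k

vecs = [(3, -1), (4, 5), (-4, 7)]     # three vectors in the plane, so r > n
k = nontrivial_solution(vecs)
print(k)
```

Since there can be at most n pivots among r > n columns, the list of free columns is never empty in the situation of the theorem.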


CALCULUS REQUIRED 
Linear Independence of Functions 


Sometimes linear dependence of functions can be deduced from known identities. For example, the functions
$\mathbf{f}_1 = \sin^2 x$, $\mathbf{f}_2 = \cos^2 x$, and $\mathbf{f}_3 = 5$
form a linearly dependent set in $F(-\infty, \infty)$, since the equation
$$5\mathbf{f}_1 + 5\mathbf{f}_2 - \mathbf{f}_3 = 5\sin^2 x + 5\cos^2 x - 5 = 5(\sin^2 x + \cos^2 x) - 5 = \mathbf{0}$$
expresses 0 as a linear combination of $\mathbf{f}_1$, $\mathbf{f}_2$, and $\mathbf{f}_3$ with coefficients that are not all zero.
Unfortunately, there is no general method that can be used to determine whether a set of functions is linearly
independent or linearly dependent. However, there does exist a theorem that is useful for establishing linear
independence in certain circumstances. The following definition will be useful for discussing that theorem.


DEFINITION 2 


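If $\mathbf{f}_1 = f_1(x)$, $\mathbf{f}_2 = f_2(x)$, ..., $\mathbf{f}_n = f_n(x)$ are functions that are $n - 1$ times differentiable on the
interval $(-\infty, \infty)$, then the determinant


$$W(x) = \begin{vmatrix}
f_1(x) & f_2(x) & \cdots & f_n(x) \\
f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x)
\end{vmatrix}$$


is called the Wronskian of $f_1, f_2, \ldots, f_n$.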


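Suppose for the moment that $\mathbf{f}_1 = f_1(x)$, $\mathbf{f}_2 = f_2(x)$, ..., $\mathbf{f}_n = f_n(x)$ are linearly dependent vectors in
$C^{(n-1)}(-\infty, \infty)$. This implies that for certain values of the coefficients the vector equation
$k_1\mathbf{f}_1 + k_2\mathbf{f}_2 + \cdots + k_n\mathbf{f}_n = \mathbf{0}$


has a nontrivial solution, or equivalently that the equation


$k_1f_1(x) + k_2f_2(x) + \cdots + k_nf_n(x) = 0$
is satisfied for all x in $(-\infty, \infty)$. Using this equation together with those that result by differentiating it
$n - 1$ times yields the linear system


$$\begin{aligned}
k_1f_1(x) + k_2f_2(x) + \cdots + k_nf_n(x) &= 0 \\
k_1f_1'(x) + k_2f_2'(x) + \cdots + k_nf_n'(x) &= 0 \\
&\ \,\vdots \\
k_1f_1^{(n-1)}(x) + k_2f_2^{(n-1)}(x) + \cdots + k_nf_n^{(n-1)}(x) &= 0
\end{aligned}$$


Thus, the linear dependence of $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ implies that the linear system


$$\begin{bmatrix}
f_1(x) & f_2(x) & \cdots & f_n(x) \\
f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\
\vdots & \vdots & & \vdots \\
f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x)
\end{bmatrix}
\begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (10)$$


has a nontrivial solution. But this implies that the determinant of the coefficient matrix of (10) is zero for every
such x. Since this determinant is the Wronskian of $f_1, f_2, \ldots, f_n$, we have established the following result.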


THEOREM 4.3.4 


If the functions $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ have $n - 1$ continuous derivatives on the interval $(-\infty, \infty)$, and if the
Wronskian of these functions is not identically zero on $(-\infty, \infty)$, then these functions form a
linearly independent set of vectors in $C^{(n-1)}(-\infty, \infty)$.


In Example 8 we showed that x and sin x are linearly independent functions by observing that neither is a 
scalar multiple of the other. The following example shows how to obtain the same result using the Wronskian 
(though it is a more complicated procedure in this particular case). 


EXAMPLE 9 Linear Independence Using the Wronskian — 
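Use the Wronskian to show that $\mathbf{f}_1 = x$ and $\mathbf{f}_2 = \sin x$ are linearly independent.
Solution The Wronskian is


$$W(x) = \begin{vmatrix} x & \sin x \\ 1 & \cos x \end{vmatrix} = x\cos x - \sin x$$


This function is not identically zero on the interval $(-\infty, \infty)$ since, for example,


$$W\left(\frac{\pi}{2}\right) = \frac{\pi}{2}\cos\left(\frac{\pi}{2}\right) - \sin\left(\frac{\pi}{2}\right) = -1$$


Thus, the functions are linearly independent.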


WARNING 


The converse of Theorem 4.3.4 is false. If the
Wronskian of $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n$ is identically zero
on $(-\infty, \infty)$, then no conclusion can be
reached about the linear independence of
$\{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_n\}$; this set of vectors may be
linearly independent or linearly dependent.


EXAMPLE 10 Linear Independence Using the Wronskian
Use the Wronskian to show that $\mathbf{f}_1 = 1$, $\mathbf{f}_2 = e^x$, and $\mathbf{f}_3 = e^{2x}$ are linearly independent.


Solution The Wronskian is
$$W(x) = \begin{vmatrix} 1 & e^x & e^{2x} \\ 0 & e^x & 2e^{2x} \\ 0 & e^x & 4e^{2x} \end{vmatrix} = 2e^{3x}$$


This function is obviously not identically zero on $(-\infty, \infty)$, so $\mathbf{f}_1$, $\mathbf{f}_2$, and $\mathbf{f}_3$ form a linearly
independent set.
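
The Wronskians in Examples 9 and 10 can be evaluated numerically at a sample point. In this minimal sketch the derivatives are supplied by hand as Python callables, so no symbolic library is assumed; the names `det`, `wronskian`, `ex9`, and `ex10` are illustrative only.

```python
import math

def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def wronskian(derivs, x):
    """Evaluate the Wronskian at x.

    derivs[j][i] is a callable giving the i-th derivative of the j-th function,
    for i = 0, ..., n - 1.
    """
    n = len(derivs)
    matrix = [[derivs[j][i](x) for j in range(n)] for i in range(n)]
    return det(matrix)

# Example 9: f1 = x and f2 = sin x, with their first derivatives.
ex9 = [
    [lambda x: x, lambda x: 1.0],
    [math.sin, math.cos],
]

# Example 10: f1 = 1, f2 = e^x, f3 = e^(2x), with first and second derivatives.
ex10 = [
    [lambda x: 1.0, lambda x: 0.0, lambda x: 0.0],
    [math.exp, math.exp, math.exp],
    [lambda x: math.exp(2 * x),
     lambda x: 2 * math.exp(2 * x),
     lambda x: 4 * math.exp(2 * x)],
]

print(wronskian(ex9, math.pi / 2), wronskian(ex10, 0.5))
```

A single point where the value is nonzero is enough to conclude independence; a zero value at one point, or even at every point, proves nothing, as the warning after Theorem 4.3.4 notes.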


OPTIONAL 


We will close this section by proving part (a) of Theorem 4.3.1. We will leave the proof of part (b) as an 
exercise. 


Proof of Theorem 4.3.1 (a) Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a set with two or more vectors. If we assume
that S is linearly dependent, then there are scalars $k_1, k_2, \ldots, k_r$, not all zero, such that


$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$ (11)


To be specific, suppose that $k_1 \neq 0$. Then (11) can be rewritten as


$$\mathbf{v}_1 = \left(-\frac{k_2}{k_1}\right)\mathbf{v}_2 + \cdots + \left(-\frac{k_r}{k_1}\right)\mathbf{v}_r$$


which expresses $\mathbf{v}_1$ as a linear combination of the other vectors in S. Similarly, if $k_j \neq 0$ in (11) for some
$j = 2, 3, \ldots, r$, then $\mathbf{v}_j$ is expressible as a linear combination of the other vectors in S.


Conversely, let us assume that at least one of the vectors in S is expressible as a linear combination of the
other vectors. To be specific, suppose that

$\mathbf{v}_1 = c_2\mathbf{v}_2 + c_3\mathbf{v}_3 + \cdots + c_r\mathbf{v}_r$
so


$\mathbf{v}_1 - c_2\mathbf{v}_2 - c_3\mathbf{v}_3 - \cdots - c_r\mathbf{v}_r = \mathbf{0}$
It follows that S is linearly dependent since the equation
$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0}$
is satisfied by
$k_1 = 1,\ k_2 = -c_2,\ \ldots,\ k_r = -c_r$


which are not all zero. The proof in the case where some vector other than $\mathbf{v}_1$ is expressible as a linear
combination of the other vectors in S is similar.


Concept Review 

* Trivial solution 

* Linearly independent set 

* Linearly dependent set 

* Wronskian 

Skills 

* Determine whether a set of vectors is linearly independent or linearly dependent. 

* Express one vector in a linearly dependent set as a linear combination of the other vectors in the set. 


* Use the Wronskian to show that a set of functions is linearly independent. 


Exercise Set 4.3 


1. Explain why the following are linearly dependent sets of vectors. (Solve this problem by inspection.)
(a) $\mathbf{u}_1 = (-1, 2, 4)$ and $\mathbf{u}_2 = (5, -10, -20)$ in $R^3$
(b) $\mathbf{u}_1 = (3, -1)$, $\mathbf{u}_2 = (4, 5)$, $\mathbf{u}_3 = (-4, 7)$ in $R^2$
(c) $\mathbf{p}_1 = 3 - 2x + x^2$ and $\mathbf{p}_2 = 6 - 4x + 2x^2$ in $P_2$


d -3 4 3 —4|. 
& a=| ә ILLO 5 | in Мп 


Answer: 


(a) $\mathbf{u}_2$ is a scalar multiple of $\mathbf{u}_1$.
(b) The vectors are linearly dependent by Theorem 4.3.3.
(c) $\mathbf{p}_2$ is a scalar multiple of $\mathbf{p}_1$.
(d) B is a scalar multiple of A.


2. Which of the following sets of vectors in $R^3$ are linearly dependent?
(a) $(4, -1, 2)$, $(-4, 10, 2)$




œ) (3,0,4), ©, —1,2), (1,1,3) 
(с) = 1,3), (4,0,1) 
(йд oma; UTE (3,2,5), (6, — 1,1), (7,0, = 2) 


3. Which of the following sets of vectors in $R^4$ are linearly dependent?

(a) $(3, 8, 7, -3)$, $(1, 5, 3, -1)$, $(2, -1, 2, 6)$, $(1, 4, 0, 3)$
(b) $(0, 0, 2, 2)$, $(3, 3, 0, 0)$, $(1, 1, 0, -1)$
(c) $(0, 3, -3, -6)$, $(-2, 0, 0, -6)$, $(0, -4, -2, -2)$, $(0, -3, 4, -4)$
(d) $(3, 0, -3, 6)$, $(0, 2, 3, 1)$, $(0, -2, -2, 0)$, $(-2, 1, 2, 1)$


Answer: 


None 


4. Which of the following sets of vectors in $P_2$ are linearly dependent?


(a) $2 - x + 4x^2$, $3 + 6x + 2x^2$, $2 + 10x - 4x^2$

(b) 3--x 4-x22—x-- 5х2, 4 3x? 

(c) 6— x? 

(d) 1+ 3x + 3х2, x + 4x? 5 + 6x + 3x2. 7 + 2x х2 


5. Assume that $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are vectors in $R^3$ that have their initial points at the origin. In each part,
determine whether the three vectors lie in a plane.
(a) $\mathbf{v}_1 = (2, -2, 0)$, $\mathbf{v}_2 = (6, 1, 4)$, $\mathbf{v}_3 = (2, 0, -4)$
(b) $\mathbf{v}_1 = (-6, 7, 2)$, $\mathbf{v}_2 = (3, 2, 4)$, $\mathbf{v}_3 = (4, -1, 2)$


Answer: 


(a) They do not lie in a plane. 
(b) They do lie in a plane. 


6. Assume that $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ are vectors in $R^3$ that have their initial points at the origin. In each part,
determine whether the three vectors lie on the same line.

(a) $\mathbf{v}_1 = (-1, 2, 3)$, $\mathbf{v}_2 = (2, -4, -6)$, $\mathbf{v}_3 = (-3, 6, 0)$
(b) $\mathbf{v}_1 = (2, -1, 4)$, $\mathbf{v}_2 = (4, 2, 3)$, $\mathbf{v}_3 = (2, 7, -6)$
(c) $\mathbf{v}_1 = (4, 6, 8)$, $\mathbf{v}_2 = (2, 3, 4)$, $\mathbf{v}_3 = (-2, -3, -4)$


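7. (a) Show that the three vectors $\mathbf{v}_1 = (0, 3, 1, -1)$, $\mathbf{v}_2 = (6, 0, 5, 1)$, and $\mathbf{v}_3 = (4, -7, 1, 3)$ form a
linearly dependent set in $R^4$.


(b) Express each vector in part (a) as a linear combination of the other two.


Answer:


(b) $\mathbf{v}_1 = \tfrac{2}{7}\mathbf{v}_2 - \tfrac{3}{7}\mathbf{v}_3$, $\mathbf{v}_2 = \tfrac{7}{2}\mathbf{v}_1 + \tfrac{3}{2}\mathbf{v}_3$, $\mathbf{v}_3 = -\tfrac{7}{3}\mathbf{v}_1 + \tfrac{2}{3}\mathbf{v}_2$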


8. (a) Show that the three vectors $\mathbf{v}_1 = (1, 2, 3, 4)$, $\mathbf{v}_2 = (0, 1, 0, -1)$, and $\mathbf{v}_3 = (1, 3, 3, 3)$ form a
linearly dependent set in $R^4$.


(b) Express each vector in part (a) as a linear combination of the other two. 


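9. For which real values of $\lambda$ do the following vectors form a linearly dependent set in $R^3$?


$\mathbf{v}_1 = \left(\lambda, -\tfrac{1}{2}, -\tfrac{1}{2}\right)$, $\mathbf{v}_2 = \left(-\tfrac{1}{2}, \lambda, -\tfrac{1}{2}\right)$, $\mathbf{v}_3 = \left(-\tfrac{1}{2}, -\tfrac{1}{2}, \lambda\right)$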


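10. Show that if $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly independent set of vectors, then so are
$\{\mathbf{v}_1, \mathbf{v}_2\}$, $\{\mathbf{v}_1, \mathbf{v}_3\}$, $\{\mathbf{v}_2, \mathbf{v}_3\}$, $\{\mathbf{v}_1\}$, $\{\mathbf{v}_2\}$, and $\{\mathbf{v}_3\}$.

11. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set of vectors, then so is every nonempty
subset of S.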


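12. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly dependent set of vectors in a vector space V, and $\mathbf{v}_4$ is any
vector in V that is not in S, then $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ is also linearly dependent.


13. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly dependent set of vectors in a vector space V, and if
$\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n$ are any vectors in V that are not in S, then $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is also linearly
dependent.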


14. Show that in $P_2$ every set with more than three vectors is linearly dependent.


15. Show that if $\{\mathbf{v}_1, \mathbf{v}_2\}$ is linearly independent and $\mathbf{v}_3$ does not lie in $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\}$, then $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is
linearly independent.


16. Prove: For any vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in a vector space V, the vectors $\mathbf{u} - \mathbf{v}$, $\mathbf{v} - \mathbf{w}$, and $\mathbf{w} - \mathbf{u}$ form a
linearly dependent set.


17. Prove: The space spanned by two vectors in $R^3$ is a line through the origin, a plane through the origin, or
the origin itself.


18. Under what conditions is a set with one vector linearly independent? 


19. Are the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ in part (a) of the accompanying figure linearly independent? What about
those in part (b)? Explain.


Figure Ex-19


Answer:


(a) They are linearly independent since $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ do not lie in the same plane when they are placed
with their initial points at the origin.


(b) They are not linearly independent since $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ lie in the same plane when they are placed
with their initial points at the origin.


20. By using appropriate identities, where required, determine which of the following sets of vectors in
$F(-\infty, \infty)$ are linearly dependent.

(a) $6$, $3\sin^2 x$, $2\cos^2 x$
(b) $x$, $\cos x$
(c) $1$, $\sin x$, $\sin 2x$
(d) $\cos 2x$, $\sin^2 x$, $\cos^2 x$
(e) $(3 - x)^2$, $x^2 - 6x$, $5$
(f) $0$, $\cos^3 \pi x$, $\sin^5 3\pi x$


21. The functions $f_1(x) = x$ and $f_2(x) = \cos x$ are linearly independent in $F(-\infty, \infty)$ because neither
function is a scalar multiple of the other. Confirm the linear independence using Wronski's test.


Answer:


$W(x) = -x\sin x - \cos x \neq 0$ for some x.


22. The functions $f_1(x) = \sin x$ and $f_2(x) = \cos x$ are linearly independent in $F(-\infty, \infty)$ because
neither function is a scalar multiple of the other. Confirm the linear independence using Wronski's test.


23. (Calculus required) Use the Wronskian to show that the following sets of vectors are linearly
independent.

(a) $1$, $x$, $e^x$
(b) $1$, $x$, $x^2$


Answer:

(a) $W(x) = e^x \neq 0$
(b) $W(x) = 2 \neq 0$


24. Show that the functions $f_1(x) = e^x$, $f_2(x) = xe^x$, and $f_3(x) = x^2e^x$ are linearly independent.


25. Show that the functions $f_1(x) = \sin x$, $f_2(x) = \cos x$, and $f_3(x) = x\cos x$ are linearly independent.


Answer:


$W(x) = 2\sin x \neq 0$ for some x.


26. Use part (a) of Theorem 4.3.1 to prove part (b).


27. Prove part (b) of Theorem 4.3.2.


28. (a) In Example 1 we showed that the mutually orthogonal vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ form a linearly independent
set of vectors in $R^3$. Do you think that every set of three nonzero mutually orthogonal vectors in $R^3$ is
linearly independent? Justify your conclusion with a geometric argument.


(b) Justify your conclusion with an algebraic argument. [Hint: Use dot products.]
True-False Exercises 


In parts (a)-(h) determine whether the statement is true or false, and justify your answer. 
(a) A set containing a single vector is linearly independent. 
Answer: 


False 


(b) The set of vectors $\{\mathbf{v}, k\mathbf{v}\}$ is linearly dependent for every scalar k.
Answer: 


True 


(c) Every linearly dependent set contains the zero vector. 
Answer: 


False 


(d) If the set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is linearly independent, then $\{k\mathbf{v}_1, k\mathbf{v}_2, k\mathbf{v}_3\}$ is also linearly
independent for every nonzero scalar k.


Answer: 


True 


(e) If $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are linearly dependent nonzero vectors, then at least one vector $\mathbf{v}_k$ is a unique linear
combination of $\mathbf{v}_1, \ldots, \mathbf{v}_{k-1}$.


Answer: 


True 


(f) The set of $2 \times 2$ matrices that contain exactly two 1's and two 0's is a linearly independent set in $M_{22}$.
Answer: 


False 


(g) The three polynomials $(x - 1)(x + 2)$, $x(x + 2)$, and $x(x - 1)$ are linearly independent.
Answer: 


True 


(h) The functions $f_1$ and $f_2$ are linearly dependent if there is a real number x so that
$k_1f_1(x) + k_2f_2(x) = 0$ for some scalars $k_1$ and $k_2$.


Answer: 


False 
4.4 Coordinates and Basis


We usually think of a line as being one-dimensional, a plane as two-dimensional, and the space around us as three- 
dimensional. It is the primary goal of this section and the next to make this intuitive notion of dimension precise. 
In this section we will discuss coordinate systems in general vector spaces and lay the groundwork for a precise 
definition of dimension in the next section. 


Coordinate Systems in Linear Algebra 


In analytic geometry we learned to use rectangular coordinate systems to create a one-to-one correspondence 
between points in 2-space and ordered pairs of real numbers and between points in 3-space and ordered triples of 
real numbers (Figure 4.4.1). Although rectangular coordinate systems are common, they are not essential. For 
example, Figure 4.4.2 shows coordinate systems in 2-space and 3-space in which the coordinate axes are not 
mutually perpendicular. 


Figure 4.4.1 Coordinates of P in a rectangular coordinate system in 2-space and in 3-space.


Figure 4.4.2 Coordinates of P in a nonrectangular coordinate system in 2-space and in 3-space.


In linear algebra coordinate systems are commonly specified using vectors rather than coordinate axes. For 
example, in Figure 4.4.3 we have recreated the coordinate systems in Figure 4.4.2 by using unit vectors to identify 
the positive directions and then attaching coordinates to a point P using the scalar coefficients in the equations 


$\overrightarrow{OP} = a\mathbf{u}_1 + b\mathbf{u}_2$ and $\overrightarrow{OP} = a\mathbf{u}_1 + b\mathbf{u}_2 + c\mathbf{u}_3$
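
To see how coordinates are attached in a nonrectangular system, the 2-space equation above can be solved for a and b. The sketch below uses Cramer's rule; the 60-degree angle between the axes, the point (3, 2), and the function name `coordinates_2d` are all assumptions chosen for illustration.

```python
import math

def coordinates_2d(u1, u2, p):
    """Solve p = a*u1 + b*u2 for (a, b) by Cramer's rule.

    Raises ValueError if u1 and u2 are linearly dependent (zero determinant).
    """
    d = u1[0] * u2[1] - u1[1] * u2[0]
    if abs(d) < 1e-12:
        raise ValueError("u1 and u2 are linearly dependent")
    a = (p[0] * u2[1] - p[1] * u2[0]) / d
    b = (u1[0] * p[1] - u1[1] * p[0]) / d
    return a, b

# Unit vectors along two skew axes that meet at 60 degrees (an assumed angle).
u1 = (1.0, 0.0)
u2 = (math.cos(math.radians(60)), math.sin(math.radians(60)))

a, b = coordinates_2d(u1, u2, (3.0, 2.0))
print(a, b)
```

The same point has different coordinates in the skew system than in the rectangular one, but the coordinates are still unique because the two direction vectors are linearly independent.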


Figure 4.4.3


Units of measurement are essential ingredients of any coordinate system. In geometry problems one tries to use 
the same unit of measurement on all axes to avoid distorting the shapes of figures. This is less important in 
applications where coordinates represent physical quantities with diverse units (for example, time in seconds on 
one axis and temperature in degrees Celsius on another axis). To allow for this level of generality, we will relax 
the requirement that unit vectors be used to identify the positive directions and require only that those vectors be 
linearly independent. We will refer to these as the "basis vectors" for the coordinate system. In summary, it is the
directions of the basis vectors that establish the positive directions, and it is the lengths of the basis vectors that 
establish the spacing between the integer points on the axes (Figure 4.4.4). 


Figure 4.4.4 Four coordinate systems: perpendicular axes with equal spacing, perpendicular axes with unequal spacing, skew axes with equal spacing, and skew axes with unequal spacing.


Basis for a Vector Space 


The following definition will make the preceding ideas more precise and will enable us to extend the concept of a 
coordinate system to general vector spaces. 


Note that in Definition 1 we have required a basis 
to have finitely many vectors. Some authors call 
this a finite basis, but we will not use this 
terminology. 


DEFINITION 1 
If V is any vector space and $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a finite set of vectors in V, then S is called a basis for 
V if the following two conditions hold: 


(a) S is linearly independent. 
(b) S spans V. 


If you think of a basis as describing a coordinate system for a vector space V, then part (a) of this definition 
guarantees that there is no interrelationship between the basis vectors, and part (b) guarantees that there are 
enough basis vectors to provide coordinates for all vectors in V. Here are some examples. 


EXAMPLE 1 The Standard Basis for $R^n$


Recall from Example 11 of Section 4.2 that the standard unit vectors 
$$\mathbf{e}_1 = (1, 0, 0, \ldots, 0),\quad \mathbf{e}_2 = (0, 1, 0, \ldots, 0),\quad \ldots,\quad \mathbf{e}_n = (0, 0, 0, \ldots, 1)$$
span $R^n$ and from Example 1 of Section 4.3 that they are linearly independent. Thus, they form a 
basis for $R^n$ that we call the standard basis for $R^n$. In particular, 
$$\mathbf{i} = (1, 0, 0),\quad \mathbf{j} = (0, 1, 0),\quad \mathbf{k} = (0, 0, 1)$$
is the standard basis for $R^3$. 


EXAMPLE 2 The Standard Basis for $P_n$


Show that $S = \{1, x, x^2, \ldots, x^n\}$ is a basis for the vector space $P_n$ of polynomials of degree n or 
less. 

Solution We must show that the polynomials in S are linearly independent and span $P_n$. Let us 
denote these polynomials by 


$$\mathbf{p}_0 = 1,\quad \mathbf{p}_1 = x,\quad \mathbf{p}_2 = x^2,\quad \ldots,\quad \mathbf{p}_n = x^n$$
We showed in Example 13 of Section 4.2 that these vectors span $P_n$ and in Example 4 of Section 
4.3 that they are linearly independent. Thus, they form a basis for $P_n$ that we call the standard basis 
for $P_n$. 


EXAMPLE 3 Another Basis for $R^3$
Show that the vectors $\mathbf{v}_1 = (1, 2, 1)$, $\mathbf{v}_2 = (2, 9, 0)$, and $\mathbf{v}_3 = (3, 3, 4)$ form a basis for $R^3$. 


Solution We must show that these vectors are linearly independent and span $R^3$. To prove linear 
independence we must show that the vector equation 


$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{0} \qquad (1)$$


has only the trivial solution; and to prove that the vectors span $R^3$ we must show that every vector 
$\mathbf{b} = (b_1, b_2, b_3)$ in $R^3$ can be expressed as 


$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 = \mathbf{b} \qquad (2)$$


By equating corresponding components on the two sides, these two equations can be expressed as 
the linear systems 


$$\begin{aligned} c_1 + 2c_2 + 3c_3 &= 0 \\ 2c_1 + 9c_2 + 3c_3 &= 0 \\ c_1 + 4c_3 &= 0 \end{aligned} \qquad\text{and}\qquad \begin{aligned} c_1 + 2c_2 + 3c_3 &= b_1 \\ 2c_1 + 9c_2 + 3c_3 &= b_2 \\ c_1 + 4c_3 &= b_3 \end{aligned} \qquad (3)$$


(verify). Thus, we have reduced the problem to showing that in (3) the homogeneous system has only 
the trivial solution and that the nonhomogeneous system is consistent for all values of $b_1$, $b_2$, and $b_3$. 
But the two systems have the same coefficient matrix 


$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 9 & 3 \\ 1 & 0 & 4 \end{bmatrix}$$
so it follows from parts (b), (e), and (g) of Theorem 2.3.8 that we can prove both results at the same 
time by showing that $\det(A) \neq 0$. We leave it for you to confirm that $\det(A) = -1$, which proves 
that the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ form a basis for $R^3$. 
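The determinant test in Example 3 is easy to check numerically. The following is a minimal sketch, not part of the text's method: the helper name `det3` is our own, and the matrix is built with the three vectors as its columns, as in the coefficient matrix A above.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The vectors of Example 3 become the columns of the coefficient matrix A.
v1, v2, v3 = (1, 2, 1), (2, 9, 0), (3, 3, 4)
A = [[v1[r], v2[r], v3[r]] for r in range(3)]

d = det3(A)
is_basis = d != 0  # nonzero determinant: independent and spanning at once
print(d, is_basis)
```

A nonzero determinant settles linear independence and spanning together, which is why a single computation suffices here.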


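EXAMPLE 4 The Standard Basis for $M_{mn}$


Show that the matrices 

$$M_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\quad M_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\quad M_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\quad M_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

form a basis for the vector space $M_{22}$ of 2 x 2 matrices. 


Solution We must show that the matrices are linearly independent and span $M_{22}$. To prove linear 
independence we must show that the equation 


$$c_1M_1 + c_2M_2 + c_3M_3 + c_4M_4 = 0 \qquad (4)$$


has only the trivial solution, where 0 is the 2 x 2 zero matrix; and to prove that the matrices span 
$M_{22}$ we must show that every 2 x 2 matrix 

$$B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

can be expressed as 


$$c_1M_1 + c_2M_2 + c_3M_3 + c_4M_4 = B \qquad (5)$$


The matrix forms of Equations 4 and 5 are 

$$c_1\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + c_2\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c_3\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + c_4\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

and 

$$c_1\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + c_2\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c_3\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + c_4\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

which can be rewritten as 


$$\begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad\text{and}\qquad \begin{bmatrix} c_1 & c_2 \\ c_3 & c_4 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

Since the first equation has only the trivial solution 
$$c_1 = c_2 = c_3 = c_4 = 0$$
the matrices are linearly independent, and since the second equation has the solution 
$$c_1 = a,\quad c_2 = b,\quad c_3 = c,\quad c_4 = d$$

the matrices span $M_{22}$. This proves that the matrices $M_1$, $M_2$, $M_3$, $M_4$ form a basis for $M_{22}$. 


More generally, the mn different matrices whose entries are zero except for a single entry of 1 form 
a basis for $M_{mn}$ called the standard basis for $M_{mn}$. 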


Some writers define the empty set to be a basis 
for the zero vector space, but we will not do so. 


It is not true that every vector space has a basis in the sense of Definition 1. The simplest example is the zero 
vector space, which contains no linearly independent sets and hence no basis. The following is an example of a 
nonzero vector space that has no basis in the sense of Definition 1 because it cannot be spanned by finitely many 
vectors. 


EXAMPLE 5 A Vector Space That Has No Finite Spanning Set
Show that the vector space $P_\infty$ of all polynomials with real coefficients has no finite spanning set. 


Solution If there were a finite spanning set, say $S = \{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_r\}$, then the degrees of the 
polynomials in S would have a maximum value, say n; and this in turn would imply that any linear 
combination of the polynomials in S would have degree at most n. Thus, there would be no way to 
express the polynomial $x^{n+1}$ as a linear combination of the polynomials in S, contradicting the fact that 
the vectors in S span $P_\infty$. 


For reasons that will become clear shortly, a vector space that cannot be spanned by finitely many vectors is said 
to be infinite-dimensional, whereas those that can are said to be finite-dimensional. 


EXAMPLE 6 Some Finite- and Infinite-Dimensional Spaces


In Example 1, Example 2, and Example 4 we found bases for $R^n$, $P_n$, and $M_{mn}$, so these vector 
spaces are finite-dimensional. We showed in Example 5 that the vector space $P_\infty$ is not spanned by 
finitely many vectors and hence is infinite-dimensional. In the exercises of this section and the next 
we will ask you to show that the vector spaces $R^\infty$, $F(-\infty, \infty)$, $C(-\infty, \infty)$, $C^m(-\infty, \infty)$, and 
$C^\infty(-\infty, \infty)$ are infinite-dimensional. 


Coordinates Relative to a Basis 


Earlier in this section we drew an informal analogy between basis vectors and coordinate systems. Our next goal is 
to make this informal idea precise by defining the notion of a coordinate system in a general vector space. The 
following theorem will be our first step in that direction. 


THEOREM 4.4.1 Uniqueness of Basis Representation 


If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, then every vector $\mathbf{v}$ in V can be expressed in the 
form $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$ in exactly one way. 


Proof Since S spans V, it follows from the definition of a spanning set that every vector in V is expressible as a 
linear combination of the vectors in S. To see that there is only one way to express a vector as a linear combination 
of the vectors in S, suppose that some vector v can be written as 


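$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$
and also as 

$$\mathbf{v} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n$$
Subtracting the second equation from the first gives 

$$\mathbf{0} = (c_1 - k_1)\mathbf{v}_1 + (c_2 - k_2)\mathbf{v}_2 + \cdots + (c_n - k_n)\mathbf{v}_n$$
Since the right side of this equation is a linear combination of vectors in S, the linear independence of S implies 
that 
$$c_1 - k_1 = 0,\quad c_2 - k_2 = 0,\quad \ldots,\quad c_n - k_n = 0$$

that is, 

$$c_1 = k_1,\quad c_2 = k_2,\quad \ldots,\quad c_n = k_n$$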


Thus, the two expressions for v are the same. 


Figure 4.4.5 


Sometimes it will be desirable to write a 
coordinate vector as a column matrix, in which 
case we will denote it using square brackets as 

$$[\mathbf{v}]_S = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$$

We will refer to $[\mathbf{v}]_S$ as a coordinate matrix and 
reserve the terminology coordinate vector for the 
comma-delimited form $(\mathbf{v})_S$. 


We now have all of the ingredients required to define the notion of "coordinates" in a general vector space V. For 
motivation, observe that in $R^3$, for example, the coordinates (a, b, c) of a vector $\mathbf{v}$ are precisely the coefficients in 
the formula 
$$\mathbf{v} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$$
that expresses $\mathbf{v}$ as a linear combination of the standard basis vectors for $R^3$ (see Figure 4.4.5). The following 
definition generalizes this idea. 


DEFINITION 2 


If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, and 

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$
is the expression for a vector $\mathbf{v}$ in terms of the basis S, then the scalars $c_1, c_2, \ldots, c_n$ are called the 
coordinates of $\mathbf{v}$ relative to the basis S. The vector $(c_1, c_2, \ldots, c_n)$ in $R^n$ constructed from these 
coordinates is called the coordinate vector of $\mathbf{v}$ relative to S; it is denoted by 


$$(\mathbf{v})_S = (c_1, c_2, \ldots, c_n) \qquad (6)$$


Remark Recall that two sets are considered to be the same if they have the same members, even if those 
members are written in a different order. However, if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a set of basis vectors, then changing 
the order in which the vectors are written would change the order of the entries in $(\mathbf{v})_S$, possibly producing a 
different coordinate vector. To avoid this complication, we will make the convention that in any discussion 
involving a basis S the order of the vectors in S remains fixed. Some authors call a set of basis vectors with this 
restriction an ordered basis. However, we will use this terminology only when emphasis on the order is required 
for clarity. 


Observe that $(\mathbf{v})_S$ is a vector in $R^n$, so that once a basis S is given for a vector space V, Theorem 4.4.1 establishes a 
one-to-one correspondence between vectors in V and vectors in $R^n$ (Figure 4.4.6). 


Figure 4.4.6 A one-to-one correspondence between V and $R^n$.


EXAMPLE 7 Coordinates Relative to the Standard Basis for $R^n$


In the special case where $V = R^n$ and S is the standard basis, the coordinate vector $(\mathbf{v})_S$ and the vector 
$\mathbf{v}$ are the same; that is, 


$$\mathbf{v} = (\mathbf{v})_S$$
For example, in $R^3$ the representation of a vector $\mathbf{v} = (a, b, c)$ as a linear combination of the vectors in 
the standard basis $S = \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is 
$$\mathbf{v} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$$


so the coordinate vector relative to this basis is $(\mathbf{v})_S = (a, b, c)$, which is the same as the vector $\mathbf{v}$. 


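EXAMPLE 8 Coordinate Vectors Relative to Standard Bases


(a) Find the coordinate vector for the polynomial 
$$p(x) = c_0 + c_1x + c_2x^2 + \cdots + c_nx^n$$
relative to the standard basis for the vector space $P_n$. 


(b) Find the coordinate vector of 

$$B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

relative to the standard basis for $M_{22}$. 


Solution 


(a) The given formula for p(x) expresses this polynomial as a linear combination of the standard 
basis vectors $S = \{1, x, x^2, \ldots, x^n\}$. Thus, the coordinate vector for p relative to S is 


$$(\mathbf{p})_S = (c_0, c_1, c_2, \ldots, c_n)$$


(b) We showed in Example 4 that the representation of a vector 

$$B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

as a linear combination of the standard basis vectors is 

$$B = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

so the coordinate vector of B relative to S is 


$$(B)_S = (a, b, c, d)$$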


EXAMPLE 9 Coordinates in $R^3$


(a) We showed in Example 3 that the vectors 
$$\mathbf{v}_1 = (1, 2, 1),\quad \mathbf{v}_2 = (2, 9, 0),\quad \mathbf{v}_3 = (3, 3, 4)$$
form a basis for $R^3$. Find the coordinate vector of $\mathbf{v} = (5, -1, 9)$ relative to the basis 
$S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$. 


(b) Find the vector $\mathbf{v}$ in $R^3$ whose coordinate vector relative to S is $(\mathbf{v})_S = (-1, 3, 2)$. 


Solution 


(a) To find $(\mathbf{v})_S$ we must first express $\mathbf{v}$ as a linear combination of the vectors in S; that is, we must 
find values of $c_1$, $c_2$, and $c_3$ such that 
$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$$
or, in terms of components, 


$$(5, -1, 9) = c_1(1, 2, 1) + c_2(2, 9, 0) + c_3(3, 3, 4)$$


Equating corresponding components gives 


$$\begin{aligned} c_1 + 2c_2 + 3c_3 &= 5 \\ 2c_1 + 9c_2 + 3c_3 &= -1 \\ c_1 + 4c_3 &= 9 \end{aligned}$$
Solving this system we obtain $c_1 = 1$, $c_2 = -1$, $c_3 = 2$ (verify). Therefore, 


$$(\mathbf{v})_S = (1, -1, 2)$$
(b) Using the definition of $(\mathbf{v})_S$, we obtain 
$$\mathbf{v} = (-1)\mathbf{v}_1 + 3\mathbf{v}_2 + 2\mathbf{v}_3 = (-1)(1, 2, 1) + 3(2, 9, 0) + 2(3, 3, 4) = (11, 31, 7)$$
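Both parts of Example 9 can be reproduced with a few lines of exact arithmetic. The sketch below is illustrative only: the function names `solve`, `coordinates`, and `from_coordinates` are our own, and `solve` assumes the coefficient matrix is invertible, which holds whenever its columns form a basis.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A c = b by Gauss-Jordan elimination.

    Assumes A is invertible (true when its columns form a basis).
    """
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def coordinates(basis, v):
    """Coordinate vector (v)_S; the basis vectors are the columns of the system."""
    A = [[vec[i] for vec in basis] for i in range(len(v))]
    return solve(A, v)

def from_coordinates(basis, coords):
    """Rebuild v = c1*v1 + ... + cn*vn from its coordinate vector."""
    return [sum(c * vec[i] for c, vec in zip(coords, basis))
            for i in range(len(basis[0]))]

S = [(1, 2, 1), (2, 9, 0), (3, 3, 4)]
coords_a = coordinates(S, (5, -1, 9))      # part (a)
v_b = from_coordinates(S, (-1, 3, 2))      # part (b)
print(coords_a, v_b)
```

Part (a) requires solving a linear system, whereas part (b) is only a matter of forming the linear combination, so the two directions of the correspondence differ in cost.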


Concept Review 

* Basis 

* Standard bases for $R^n$, $P_n$, $M_{mn}$ 

* Finite-dimensional 

* Infinite-dimensional 

* Coordinates 

* Coordinate vector 

Skills 

* Show that a set of vectors is a basis for a vector space. 
* Find the coordinates of a vector relative to a basis. 


* Find the coordinate vector of a vector relative to a basis. 


Exercise Set 4.4 


1. In words, explain why the following sets of vectors are not bases for the indicated vector spaces. 
(a) $\mathbf{u}_1 = (1, 2)$, $\mathbf{u}_2 = (0, 3)$, $\mathbf{u}_3 = (2, 7)$ for $R^2$ 
(b) $\mathbf{u}_1 = (-1, 3, 2)$, $\mathbf{u}_2 = (6, 1, 1)$ for $R^3$ 
(c) $\mathbf{p}_1 = 1 + x + x^2$, $\mathbf{p}_2 = x - 1$ for $P_2$ 
(d) five 2 x 2 matrices A, B, C, D, E for $M_{22}$ [matrix entries illegible in the source] 


Answer: 


(a) A basis for $R^2$ has two linearly independent vectors. 
(b) A basis for $R^3$ has three linearly independent vectors. 
(c) A basis for $P_2$ has three linearly independent vectors. 
(d) A basis for $M_{22}$ has four linearly independent vectors. 


2. Which of the following sets of vectors are bases for $R^2$? 
(a) $\{(2, 1), (3, 0)\}$ 
(b) $\{(4, 1), (-7, -8)\}$ 
(c) $\{(0, 0), (1, 3)\}$ 
(d) $\{(3, 9), (-4, -12)\}$ 
3. Which of the following sets of vectors are bases for $R^3$? 
(a) $\{(1, 0, 0), (2, 2, 0), (3, 5, 3)\}$ 
(b) $\{(3, 1, -3), (2, 5, 6), (1, 4, 8)\}$ 
(c) $\{(2, -3, 1), (4, 1, 1), (0, -7, 1)\}$ 
(d) $\{(1, 6, 4), (2, 4, -1), (-1, 2, 5)\}$ 
Answer: 


(a), (b) 


4. Which of the following form bases for $P_2$? 


(a) $1 - 3x + 2x^2$, $1 + x + 4x^2$, $1 - 7x$ 
(b) $4 + 6x + x^2$, $-1 + 4x + 2x^2$, $5 + 2x - x^2$ 
(c) $1 + x + x^2$, $x + x^2$, $x^2$ 
(d) $-4 + x + 3x^2$, $6 + 5x + 2x^2$, $8 + 4x + x^2$ 


5. Show that the following matrices form a basis for $M_{22}$. [Matrices illegible in the source.] 


6. Let V be the space spanned by $\mathbf{v}_1 = \cos^2 x$, $\mathbf{v}_2 = \sin^2 x$, $\mathbf{v}_3 = \cos 2x$. 


(a) Show that $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is not a basis for V. 
(b) Find a basis for V. 


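7. Find the coordinate vector of $\mathbf{w}$ relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$. 


(a) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (0, 1)$; $\mathbf{w} = (3, -7)$ 
(b) $\mathbf{u}_1 = (2, -4)$, $\mathbf{u}_2 = (3, 8)$; $\mathbf{w} = (1, 1)$ 
(c) $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (0, 2)$; $\mathbf{w} = (a, b)$ 


Answer: 
(a) $(\mathbf{w})_S = (3, -7)$ 
(b) $(\mathbf{w})_S = \left(\frac{5}{28}, \frac{3}{14}\right)$ 
(c) $(\mathbf{w})_S = \left(a, \frac{b - a}{2}\right)$ 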


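8. Find the coordinate vector of $\mathbf{w}$ relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ of $R^2$. 


(a) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (1, 9)$ 
(b) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (0, 1)$ 
(c) $\mathbf{u}_1 = (1, -1)$, $\mathbf{u}_2 = (1, 1)$; $\mathbf{w} = (1, 1)$ 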


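9. Find the coordinate vector of $\mathbf{v}$ relative to the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$. 


(a) $\mathbf{v} = (2, -1, 3)$; $\mathbf{v}_1 = (1, 0, 0)$, $\mathbf{v}_2 = (2, 2, 0)$, $\mathbf{v}_3 = (3, 3, 3)$ 
(b) $\mathbf{v} = (5, -12, 3)$; $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (-4, 5, 6)$, $\mathbf{v}_3 = (7, -8, 9)$ 


Answer: 


(a) $(\mathbf{v})_S = (3, -2, 1)$ 
(b) $(\mathbf{v})_S = (-2, 0, 1)$ 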


1 0 
-1 2 


| 


10. Find the coordinate vector of $\mathbf{p}$ relative to the basis $S = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$. 
(a) $\mathbf{p} = 4 - 3x + x^2$; $\mathbf{p}_1 = 1$, $\mathbf{p}_2 = x$, $\mathbf{p}_3 = x^2$ 
(b) $\mathbf{p} = 2 - x + x^2$; $\mathbf{p}_1 = 1 + x$, $\mathbf{p}_2 = 1 + x^2$, $\mathbf{p}_3 = x + x^2$ 
11. Find the coordinate vector of A relative to the basis $S = \{A_1, A_2, A_3, A_4\}$. [Matrices illegible in the source.] 


Answer: 
$(A)_S = (-1, 1, -1, 3)$ 


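In Exercises 12-13, show that $\{A_1, A_2, A_3, A_4\}$ is a basis for $M_{22}$, and express A as a linear combination of the 
basis vectors. 


Answer: 
$A = A_1 - A_2 + A_3 - A_4$ 


In Exercises 14-15, show that $\{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$ is a basis for $P_2$, and express $\mathbf{p}$ as a linear combination of the basis 
vectors. 


14. $\mathbf{p}_1 = 1 + 2x + x^2$, $\mathbf{p}_2 = 2 + 9x$, $\mathbf{p}_3 = 3 + 3x + 4x^2$; $\mathbf{p} = 2 + 17x - 3x^2$ 
15. $\mathbf{p}_1 = 1 + x + x^2$, $\mathbf{p}_2 = x + x^2$, $\mathbf{p}_3 = x^2$; $\mathbf{p} = 7 - x + 2x^2$ 
Answer: 


$\mathbf{p} = 7\mathbf{p}_1 - 8\mathbf{p}_2 + 3\mathbf{p}_3$ 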
16. The accompanying figure shows a rectangular xy-coordinate system and an x'y'-coordinate system with 
skewed axes. Assuming that 1-unit scales are used on all the axes, find the x'y'-coordinates of the points 
whose xy-coordinates are given. 
(a) (1, 1) 
(b) (1, 0) 
(c) (0, 1) 
(d) (a, b) 


Figure Ex-16 


17. The accompanying figure shows a rectangular xy-coordinate system determined by the unit basis vectors $\mathbf{i}$ and 
$\mathbf{j}$ and an x'y'-coordinate system determined by unit basis vectors $\mathbf{u}_1$ and $\mathbf{u}_2$. Find the x'y'-coordinates of the 
points whose xy-coordinates are given. 
(a) $(\sqrt{3}, 1)$ 

(b) (1, 0) 

(c) (0, 1) 

(d) (a, b) 


Figure Ex-17 


Answer: 


(a) (2, 0) 

(b) $\left(\frac{2}{\sqrt{3}}, -\frac{1}{\sqrt{3}}\right)$ 

(c) (0, 1) 

(d) $\left(\frac{2a}{\sqrt{3}}, b - \frac{a}{\sqrt{3}}\right)$ 


18. The basis that we gave for $M_{22}$ in Example 4 consisted of noninvertible matrices. Do you think that there is a 
basis for $M_{22}$ consisting of invertible matrices? Justify your answer. 


19. Prove that $R^\infty$ is infinite-dimensional. 


True-False Exercises 


In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 


(a) If $V = \text{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$, then $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for V. 


Answer: 


False 


(b) Every linearly independent subset of a vector space V is a basis for V. 
Answer: 


False 


(c) If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, then every vector in V can be expressed as a linear 
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$. 


Answer: 


True 


(d) The coordinate vector of a vector $\mathbf{x}$ in $R^n$ relative to the standard basis for $R^n$ is $\mathbf{x}$. 
Answer: 


True 


(e) Every basis of P4 contains at least one polynomial of degree 3 or less. 
Answer: 


False 
4.5 Dimension 


We showed in the previous section that the standard basis for $R^n$ has n vectors and hence that the standard basis 
for $R^3$ has three vectors, the standard basis for $R^2$ has two vectors, and the standard basis for $R^1$ (= R) has one 
vector. Since we think of space as three dimensional, a plane as two dimensional, and a line as one 
dimensional, there seems to be a link between the number of vectors in a basis and the dimension of a vector 
space. We will develop this idea in this section. 


Number of Vectors in a Basis 


Our first goal in this section is to establish the following fundamental theorem. 


THEOREM 4.5.1 


All bases for a finite-dimensional vector space have the same number of vectors. 


To prove this theorem we will need the following preliminary result, whose proof is deferred to the end of the 
section. 


THEOREM 4.5.2 


Let V be a finite-dimensional vector space, and let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be any basis. 
(a) If a set has more than n vectors, then it is linearly dependent. 


(b) If a set has fewer than n vectors, then it does not span V. 


Some writers regard the empty set to be a basis 
for the zero vector space. This is consistent with 
our definition of dimension, since the empty set 
has no vectors and the zero vector space has 
dimension zero. 


We can now see rather easily why Theorem 4.5.1 is true; for if 
$$S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$$
is an arbitrary basis for V, then the linear independence of S implies that any set in V with more than n vectors 
is linearly dependent and any set in V with fewer than n vectors does not span V. Thus, unless a set in V has 
exactly n vectors it cannot be a basis. 


We noted in the introduction to this section that for certain familiar vector spaces the intuitive notion of 
dimension coincides with the number of vectors in a basis. The following definition makes this idea precise. 


Engineers often use the term degrees of 
freedom as a synonym for dimension. 


DEFINITION 1 


The dimension of a finite-dimensional vector space V is denoted by dim(V) and is defined to be the 
number of vectors in a basis for V. In addition, the zero vector space is defined to have dimension zero. 


EXAMPLE 1 Dimensions of Some Familiar Vector Spaces


$\dim(R^n) = n$ The standard basis has n vectors. 


$\dim(P_n) = n + 1$ The standard basis has n + 1 vectors. 


$\dim(M_{mn}) = mn$ The standard basis has mn vectors. 


EXAMPLE 2 Dimension of Span(S)


If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set in a vector space V, then S is automatically 
a basis for span(S) (why?), and this implies that 


$$\dim[\text{span}(S)] = r$$


In words, the dimension of the space spanned by a linearly independent set of vectors is equal to 
the number of vectors in that set. 


EXAMPLE 3 Dimension of a Solution Space


Find a basis for and the dimension of the solution space of the homogeneous system 
$$\begin{aligned} 2x_1 + 2x_2 - x_3 + x_5 &= 0 \\ -x_1 - x_2 + 2x_3 - 3x_4 + x_5 &= 0 \\ x_1 + x_2 - 2x_3 - x_5 &= 0 \\ x_3 + x_4 + x_5 &= 0 \end{aligned}$$


Solution We leave it for you to solve this system by Gauss-Jordan elimination and show that 
its general solution is 


$$x_1 = -s - t,\quad x_2 = s,\quad x_3 = -t,\quad x_4 = 0,\quad x_5 = t$$


which can be written in vector form as 
$$(x_1, x_2, x_3, x_4, x_5) = (-s - t, s, -t, 0, t)$$
or, alternatively, as 
$$(x_1, x_2, x_3, x_4, x_5) = s(-1, 1, 0, 0, 0) + t(-1, 0, -1, 0, 1)$$


This shows that the vectors $\mathbf{v}_1 = (-1, 1, 0, 0, 0)$ and $\mathbf{v}_2 = (-1, 0, -1, 0, 1)$ span the 
solution space. Since neither vector is a scalar multiple of the other, they are linearly independent 
and hence form a basis for the solution space. Thus, the solution space has dimension 2. 
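The elimination left to the reader in Example 3 can be carried out mechanically. The following is a sketch under our own naming (`rref`, `null_space_basis`), using exact fractions; it reads one basis vector off the reduced row echelon form for each free variable, which is the procedure the example follows by hand.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivot_columns) for the reduced row echelon form of rows."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it belongs to a free variable
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def null_space_basis(rows):
    """One vector per free variable: set it to 1 and the other free variables to 0."""
    R, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -R[row_index][f]
        basis.append(v)
    return basis

# Coefficient matrix of the homogeneous system in Example 3.
A = [
    [2, 2, -1, 0, 1],
    [-1, -1, 2, -3, 1],
    [1, 1, -2, 0, -1],
    [0, 0, 1, 1, 1],
]
basis = null_space_basis(A)
dimension = len(basis)
print(dimension, basis)
```

Here the free variables are $x_2$ and $x_5$, matching the parameters s and t in the solution above.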


EXAMPLE 4 Dimension of a Solution Space


Find a basis for and the dimension of the solution space of the homogeneous system 


$$\begin{aligned} x_1 + 3x_2 - 2x_3 + 2x_5 &= 0 \\ 2x_1 + 6x_2 - 5x_3 - 2x_4 + 4x_5 - 3x_6 &= 0 \\ 5x_3 + 10x_4 + 15x_6 &= 0 \\ 2x_1 + 6x_2 + 8x_4 + 4x_5 + 18x_6 &= 0 \end{aligned}$$


Solution In Example 6 of Section 1.2 we found the solution of this system to be 
$$x_1 = -3r - 4s - 2t,\quad x_2 = r,\quad x_3 = -2s,\quad x_4 = s,\quad x_5 = t,\quad x_6 = 0$$
which can be written in vector form as 
$$(x_1, x_2, x_3, x_4, x_5, x_6) = (-3r - 4s - 2t, r, -2s, s, t, 0)$$
or, alternatively, as 
$$(x_1, x_2, x_3, x_4, x_5, x_6) = r(-3, 1, 0, 0, 0, 0) + s(-4, 0, -2, 1, 0, 0) + t(-2, 0, 0, 0, 1, 0)$$
This shows that the vectors 
$$\mathbf{v}_1 = (-3, 1, 0, 0, 0, 0),\quad \mathbf{v}_2 = (-4, 0, -2, 1, 0, 0),\quad \mathbf{v}_3 = (-2, 0, 0, 0, 1, 0)$$


span the solution space. We leave it for you to check that these vectors are linearly independent 
by showing that none of them is a linear combination of the other two (but see the remark that 
follows). Thus, the solution space has dimension 3. 


Remark It can be shown that for a homogeneous linear system, the method of the last example always 
produces a basis for the solution space of the system. We omit the formal proof. 
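Although the formal proof is omitted, the idea behind the remark can be illustrated on Example 4. The sketch below (the names `matvec`, `free_positions`, and `pattern` are ours) checks that each vector solves the system and that, in the positions of the free variables $x_2$, $x_4$, $x_5$, the three vectors form the identity pattern. That pattern forces r = s = t = 0 in any combination equal to the zero vector, which is the reason the vectors are linearly independent.

```python
# Coefficient matrix of the homogeneous system in Example 4.
A = [
    [1, 3, -2, 0, 2, 0],
    [2, 6, -5, -2, 4, -3],
    [0, 0, 5, 10, 0, 15],
    [2, 6, 0, 8, 4, 18],
]
vectors = [
    (-3, 1, 0, 0, 0, 0),
    (-4, 0, -2, 1, 0, 0),
    (-2, 0, 0, 0, 1, 0),
]
free_positions = [1, 3, 4]  # zero-based indices of x2 = r, x4 = s, x5 = t

def matvec(A, v):
    """Product of the matrix A with the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Every vector must satisfy A v = 0.
all_solutions = all(matvec(A, v) == [0] * len(A) for v in vectors)

# Entries of each vector in the free-variable positions.
pattern = [[v[p] for p in free_positions] for v in vectors]
print(all_solutions, pattern)
```

Because each vector has a 1 in its own free position and 0 in the free positions of the others, no vector can be a combination of the remaining ones.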


Some Fundamental Theorems 


We will devote the remainder of this section to a series of theorems that reveal the subtle interrelationships 
among the concepts of linear independence, basis, and dimension. These theorems are not simply exercises in 
mathematical theory—they are essential to the understanding of vector spaces and the applications that build 
on them. 


We will start with a theorem (proved at the end of this section) that is concerned with the effect on linear 
independence and spanning if a vector is added to or removed from a given nonempty set of vectors. 
Informally stated, if you start with a linearly independent set S and adjoin to it a vector that is not a linear 
combination of those in S, then the enlarged set will still be linearly independent. Also, if you start with a set S 
of two or more vectors in which one of the vectors is a linear combination of the others, then that vector can be 
removed from S without affecting span(S) (Figure 4.5.1). 


The vector outside the plane Any of the vectors can Either of the collinear 

can be adjoined to the other be removed, and the vectors can be removed, 

two without affecting their remaining two will still and the remaining two 

linear independence. span the plane. will still span the plane. 
Figure 4.5.1 


THEOREM 4.5.3 Plus/Minus Theorem 


Let S be a nonempty set of vectors in a vector space V. 
(a) If S is a linearly independent set, and if $\mathbf{v}$ is a vector in V that is outside of span(S), then the set 
$S \cup \{\mathbf{v}\}$ that results by inserting $\mathbf{v}$ into S is still linearly independent. 


(b) If $\mathbf{v}$ is a vector in S that is expressible as a linear combination of other vectors in S, and if $S - \{\mathbf{v}\}$ 
denotes the set obtained by removing $\mathbf{v}$ from S, then S and $S - \{\mathbf{v}\}$ span the same space; that is, 


$$\text{span}(S) = \text{span}(S - \{\mathbf{v}\})$$


EXAMPLE 5 Applying the Plus/Minus Theorem
Show that $\mathbf{p}_1 = 1 - x^2$, $\mathbf{p}_2 = 2 - x^2$, and $\mathbf{p}_3 = x^3$ are linearly independent vectors. 


Solution The set $S = \{\mathbf{p}_1, \mathbf{p}_2\}$ is linearly independent, since neither vector in S is a scalar 
multiple of the other. Since the vector $\mathbf{p}_3$ cannot be expressed as a linear combination of the 
vectors in S (why?), it can be adjoined to S to produce a linearly independent set 


$$S' = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$$


In general, to show that a set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a vector space V, we must show that the 
vectors are linearly independent and span V. However, if we happen to know that V has dimension n (so that 
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ contains the right number of vectors for a basis), then it suffices to check either linear 


independence or spanning— the remaining condition will hold automatically. This is the content of the 
following theorem. 


THEOREM 4.5.4 


Let V be an n-dimensional vector space, and let S be a set in V with exactly n vectors. Then S is a basis 
for V if and only if S spans V or S is linearly independent. 


Proof Assume that S has exactly n vectors and spans V. To prove that S is a basis, we must show that S is a 
linearly independent set. But if this is not so, then some vector $\mathbf{v}$ in S is a linear combination of the remaining 
vectors. If we remove this vector from S, then it follows from Theorem 4.5.3b that the remaining set of n - 1 
vectors still spans V. But this is impossible, since it follows from Theorem 4.5.2b that no set with fewer than n 
vectors can span an n-dimensional vector space. Thus S is linearly independent. 


Assume that S has exactly n vectors and is a linearly independent set. To prove that S is a basis, we must show 
that S spans V. But if this is not so, then there is some vector $\mathbf{v}$ in V that is not in span(S). If we insert this 
vector into S, then it follows from Theorem 4.5.3a that this set of n + 1 vectors is still linearly independent. 
But this is impossible, since Theorem 4.5.2a states that no set with more than n vectors in an n-dimensional 
vector space can be linearly independent. Thus S spans V. 


EXAMPLE 6 Bases by Inspection < 


(a) By inspection, explain why $\mathbf{v}_1 = (-3, 7)$ and $\mathbf{v}_2 = (5, 5)$ form a basis for $R^2$. 


(b) By inspection, explain why $\mathbf{v}_1 = (2, 0, -1)$, $\mathbf{v}_2 = (4, 0, 7)$, and $\mathbf{v}_3 = (-1, 1, 4)$ form a 
basis for $R^3$. 


Solution 


(a) Since neither vector is a scalar multiple of the other, the two vectors form a linearly 
independent set in the two-dimensional space $R^2$, and hence they form a basis by Theorem 
4.5.4. 

(b) The vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ form a linearly independent set in the xz-plane (why?). The vector $\mathbf{v}_3$ 
is outside of the xz-plane, so the set $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is also linearly independent. Since $R^3$ is 
three-dimensional, Theorem 4.5.4 implies that $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for $R^3$. 
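Theorem 4.5.4 means that for two vectors in $R^2$ a single independence check decides whether they form a basis. As a small sketch (the function name `is_basis_of_R2` is ours, and the second pair of vectors is a hypothetical counterexample, not taken from the text), the check reduces to a nonzero 2 x 2 determinant:

```python
def is_basis_of_R2(v1, v2):
    """Two vectors in the 2-dimensional space R^2 form a basis exactly when
    they are linearly independent (Theorem 4.5.4), i.e. when the determinant
    of the matrix with rows v1 and v2 is nonzero."""
    return v1[0] * v2[1] - v1[1] * v2[0] != 0

part_a = is_basis_of_R2((-3, 7), (5, 5))   # the vectors of Example 6(a)
parallel = is_basis_of_R2((1, 2), (2, 4))  # hypothetical pair: scalar multiples
print(part_a, parallel)
```

No separate spanning check is needed, because the number of vectors already equals the dimension of the space.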


The next theorem (whose proof is deferred to the end of this section) reveals two important facts about the 
vectors in a finite-dimensional vector space V: 


1. Every spanning set for a subspace is either a basis for that subspace or has a basis as a subset. 


2. Every linearly independent set in a subspace is either a basis for that subspace or can be extended to a basis 
for it. 


THEOREM 4.5.5 


Let S be a finite set of vectors in a finite-dimensional vector space V. 


(a) If S spans V but is not a basis for V, then S can be reduced to a basis for V by removing appropriate 
vectors from S. 


(b) If S is a linearly independent set that is not already a basis for V, then S can be enlarged to a basis 
for V by inserting appropriate vectors into S. 


We conclude this section with a theorem that relates the dimension of a vector space to the dimensions of its 
subspaces. 


THEOREM 4.5.6 


If W is a subspace of a finite-dimensional vector space V, then: 
(a) W is finite-dimensional. 

(b) $\dim(W) \le \dim(V)$. 

(c) W = V if and only if dim(W) = dim(V). 


Proof (a) We will leave the proof of this part for the exercises. 


Proof (b) Part (a) shows that W is finite-dimensional, so it has a basis 


$$S = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$$


Either S is also a basis for V or it is not. If so, then dim(V) = m, which means that dim(V) = dim(W). If not, 
then because S is a linearly independent set it can be enlarged to a basis for V by part (b) of Theorem 4.5.5. But 
this implies that dim(W) < dim(V), so we have shown that $\dim(W) \le \dim(V)$ in all cases. 


Proof (c) Assume that dim(W) = dim(V) and that 


$$S = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$$


is a basis for W. If S is not also a basis for V, then being linearly independent S can be extended to a basis for V 
by part (b) of Theorem 4.5.5. But this would mean that dim(V) > dim(W), which contradicts our hypothesis. 
Thus S must also be a basis for V, which means that W = V. 


Figure 4.5.2 illustrates the geometric relationship between the subspaces of $R^3$ in order of increasing 
dimension. 


Figure 4.5.2 Subspaces of $R^3$; the labeled panels include a line through the origin (1-dimensional) and a plane through the origin (2-dimensional).


OPTIONAL 


We conclude this section with optional proofs of Theorem 4.5.2, Theorem 4.5.3, and Theorem 4.5.5. 


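Proof of Theorem 4.5.2(a) Let $S' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be any set of m vectors in V, where m > n. We 
want to show that S' is linearly dependent. Since $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis, each $\mathbf{w}_i$ can be expressed as a 
linear combination of the vectors in S, say 


$$\begin{aligned} \mathbf{w}_1 &= a_{11}\mathbf{v}_1 + a_{21}\mathbf{v}_2 + \cdots + a_{n1}\mathbf{v}_n \\ \mathbf{w}_2 &= a_{12}\mathbf{v}_1 + a_{22}\mathbf{v}_2 + \cdots + a_{n2}\mathbf{v}_n \\ &\;\;\vdots \\ \mathbf{w}_m &= a_{1m}\mathbf{v}_1 + a_{2m}\mathbf{v}_2 + \cdots + a_{nm}\mathbf{v}_n \end{aligned} \qquad (1)$$
To show that S' is linearly dependent, we must find scalars $k_1, k_2, \ldots, k_m$, not all zero, such that 
$$k_1\mathbf{w}_1 + k_2\mathbf{w}_2 + \cdots + k_m\mathbf{w}_m = \mathbf{0} \qquad (2)$$


Using the equations in 1, we can rewrite 2 as 
$$\begin{aligned} &(k_1a_{11} + k_2a_{12} + \cdots + k_ma_{1m})\mathbf{v}_1 \\ &\quad + (k_1a_{21} + k_2a_{22} + \cdots + k_ma_{2m})\mathbf{v}_2 \\ &\quad\;\;\vdots \\ &\quad + (k_1a_{n1} + k_2a_{n2} + \cdots + k_ma_{nm})\mathbf{v}_n = \mathbf{0} \end{aligned}$$
Thus, from the linear independence of S, the problem of proving that S' is a linearly dependent set reduces to 
showing there are scalars $k_1, k_2, \ldots, k_m$, not all zero, that satisfy 
$$\begin{aligned} a_{11}k_1 + a_{12}k_2 + \cdots + a_{1m}k_m &= 0 \\ a_{21}k_1 + a_{22}k_2 + \cdots + a_{2m}k_m &= 0 \\ &\;\;\vdots \\ a_{n1}k_1 + a_{n2}k_2 + \cdots + a_{nm}k_m &= 0 \end{aligned} \qquad (3)$$


But 3 has more unknowns than equations, so the proof is complete since Theorem 1.2.2 guarantees the 
existence of nontrivial solutions. 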


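Proof of Theorem 4.5.2(b) Let $S' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ be any set of m vectors in V, where m < n. We 
want to show that S' does not span V. We will do this by showing that the assumption that S' spans V leads to a 
contradiction of the linear independence of $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. If S' spans V, then every vector in V is a linear 
combination of the vectors in S'. In particular, each basis vector $\mathbf{v}_i$ is a linear combination of the vectors in S', 
say 


$$\begin{aligned} \mathbf{v}_1 &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2 + \cdots + a_{m1}\mathbf{w}_m \\ \mathbf{v}_2 &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 + \cdots + a_{m2}\mathbf{w}_m \\ &\;\;\vdots \\ \mathbf{v}_n &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{mn}\mathbf{w}_m \end{aligned} \qquad (4)$$
To obtain our contradiction, we will show that there are scalars $k_1, k_2, \ldots, k_n$, not all zero, such that 
$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_n\mathbf{v}_n = \mathbf{0} \qquad (5)$$


But 4 and 5 have the same form as 1 and 2 except that m and n are interchanged and the w's and v's are 
interchanged. Thus, the computations that led to 3 now yield 


$$\begin{aligned} a_{11}k_1 + a_{12}k_2 + \cdots + a_{1n}k_n &= 0 \\ a_{21}k_1 + a_{22}k_2 + \cdots + a_{2n}k_n &= 0 \\ &\;\;\vdots \\ a_{m1}k_1 + a_{m2}k_2 + \cdots + a_{mn}k_n &= 0 \end{aligned}$$


This linear system has more unknowns than equations and hence has nontrivial solutions by Theorem 1.2.2. 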


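Proof of Theorem 4.5.3(a) Assume that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set of vectors in V, 
and $\mathbf{v}$ is a vector in V outside of span(S). To show that $S' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}\}$ is a linearly independent set, 
we must show that the only scalars that satisfy 


$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r + k_{r+1}\mathbf{v} = \mathbf{0} \qquad (6)$$


are $k_1 = k_2 = \cdots = k_r = k_{r+1} = 0$. But it must be true that $k_{r+1} = 0$, for otherwise we could solve 6 for $\mathbf{v}$ 
as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$, contradicting the assumption that $\mathbf{v}$ is outside of span(S). Thus, 6 
simplifies to 


$$k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_r\mathbf{v}_r = \mathbf{0} \qquad (7)$$


which, by the linear independence of $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$, implies that 
$$k_1 = k_2 = \cdots = k_r = 0$$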


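Proof of Theorem 4.5.3(b) Assume that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a set of vectors in V, and (to be specific) 
suppose that $\mathbf{v}_r$ is a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}$, say 


$$\mathbf{v}_r = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r-1}\mathbf{v}_{r-1} \qquad (8)$$


We want to show that if $\mathbf{v}_r$ is removed from S, then the remaining set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}\}$ still spans 
span(S); that is, we must show that every vector $\mathbf{w}$ in span(S) is expressible as a linear combination of 
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}\}$. But if $\mathbf{w}$ is in span(S), then $\mathbf{w}$ is expressible in the form 

$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_{r-1}\mathbf{v}_{r-1} + k_r\mathbf{v}_r$$


or, on substituting 8, 


$$\mathbf{w} = k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + \cdots + k_{r-1}\mathbf{v}_{r-1} + k_r(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r-1}\mathbf{v}_{r-1})$$


which expresses $\mathbf{w}$ as a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r-1}$. 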


Proof of Theorem 4.5.5(a) If $S$ is a set of vectors that spans $V$ but is not a basis for $V$, then $S$ is a linearly dependent set. Thus some vector $\mathbf{v}$ in $S$ is expressible as a linear combination of the other vectors in $S$. By the Plus/Minus Theorem (4.5.3b), we can remove $\mathbf{v}$ from $S$, and the resulting set $S'$ will still span $V$. If $S'$ is linearly independent, then $S'$ is a basis for $V$, and we are done. If $S'$ is linearly dependent, then we can remove some appropriate vector from $S'$ to produce a set $S''$ that still spans $V$. We can continue removing vectors in this way until we finally arrive at a set of vectors in $S$ that is linearly independent and spans $V$. This subset of $S$ is a basis for $V$.


Proof of Theorem 4.5.5(b) Suppose that dim($V$) = $n$. If $S$ is a linearly independent set that is not already a basis for $V$, then $S$ fails to span $V$, so there is some vector $\mathbf{v}$ in $V$ that is not in span($S$). By the Plus/Minus Theorem (4.5.3a), we can insert $\mathbf{v}$ into $S$, and the resulting set $S'$ will still be linearly independent. If $S'$ spans $V$, then $S'$ is a basis for $V$, and we are finished. If $S'$ does not span $V$, then we can insert an appropriate vector into $S'$ to produce a set $S''$ that is still linearly independent. We can continue inserting vectors in this way until we reach a set with $n$ linearly independent vectors in $V$. This set will be a basis for $V$ by Theorem 4.5.4.
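
The insertion argument in this proof can be tried out numerically. The following is only a sketch under stated assumptions: it works in $R^n$, it takes the candidate vectors to insert from the standard basis, and it tests independence by computing a rank with exact rational arithmetic. The function names `rank` and `extend_to_basis` are hypothetical helpers introduced here, not part of the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def extend_to_basis(vectors, n):
    """Insert standard basis vectors of R^n one at a time, keeping independence."""
    basis = [list(v) for v in vectors]
    for j in range(n):
        if len(basis) == n:
            break
        e = [1 if i == j else 0 for i in range(n)]
        # Keep e only if it lies outside the span of the current set.
        if rank(basis + [e]) == len(basis) + 1:
            basis.append(e)
    return basis

# Illustrative input: two linearly independent vectors in R^3.
basis = extend_to_basis([(1, -2, 3), (0, 5, -3)], 3)
print(basis)
```

Each accepted vector lies outside the span of the current set, so by part (a) of the Plus/Minus Theorem the enlarged set stays linearly independent, and the loop stops once $n$ vectors have been collected.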


Concept Review 
* Dimension 


* Relationships among the concepts of linear independence, basis, and dimension 


Skills 
* Find a basis for and the dimension of the solution space of a homogeneous linear system. 
* Use dimension to determine whether a set of vectors is a basis for a finite-dimensional vector space. 


* Extend a linearly independent set to a basis. 


Exercise Set 4.5 


In Exercises 1—6, find a basis for the solution space of the homogeneous linear system, and find the 
dimension of that space. 


1. $x_1 + x_2 - x_3 = 0$
   $-2x_1 - x_2 + 2x_3 = 0$
   $-x_1 + x_3 = 0$


Answer: 


Basis: (1, 0, 1); dimension = 1 


2. $3x_1 + x_2 + x_3 + x_4 = 0$
   $5x_1 - x_2 + x_3 - x_4 = 0$

3. $x_1 - 4x_2 + 3x_3 - x_4 = 0$
   $2x_1 - 8x_2 + 6x_3 - 2x_4 = 0$


Answer: 


Basis: (4, 1, 0, 0), (−3, 0, 1, 0), (1, 0, 0, 1); dimension = 3

4. $x_1 - 3x_2 + x_3 = 0$
   $2x_1 - 6x_2 + 2x_3 = 0$
   $3x_1 - 9x_2 + 3x_3 = 0$

5. $2x_1 + x_2 + 3x_3 = 0$
   $x_1 + 5x_3 = 0$
   $x_2 + x_3 = 0$
Answer: 
No basis; dimension = 0 
6. $x + y + z = 0$
   $3x + 2y - 2z = 0$
   $4x + 3y - z = 0$
   $6x + 5y + z = 0$

7. Find bases for the following subspaces of $R^3$.
(a) The plane $3x - 2y + 5z = 0$.
(b) The plane $x - y = 0$.
(c) The line $x = 2t$, $y = -t$, $z = 4t$.
(d) All vectors of the form $(a, b, c)$, where $b = a + c$.


Answer: 


(a) $\left(\tfrac{2}{3}, 1, 0\right)$, $\left(-\tfrac{5}{3}, 0, 1\right)$
(b) (1, 1, 0), (0, 0, 1)
(c) (2, −1, 4)
(d) (1, 1, 0), (0, 1, 1) 
8. Find the dimensions of the following subspaces of $R^4$.
(a) All vectors of the form $(a, b, c, 0)$.
(b) All vectors of the form $(a, b, c, d)$, where $d = a + b$ and $c = a - b$.
(c) All vectors of the form $(a, b, c, d)$, where $a = b = c = d$.
9. Find the dimension of each of the following vector spaces.
(a) The vector space of all diagonal $n \times n$ matrices.
(b) The vector space of all symmetric $n \times n$ matrices.
(c) The vector space of all upper triangular $n \times n$ matrices.

Answer:

(a) $n$
(b) $\dfrac{n(n+1)}{2}$
(c) $\dfrac{n(n+1)}{2}$


10. Find the dimension of the subspace of $P_3$ consisting of all polynomials $a_0 + a_1x + a_2x^2 + a_3x^3$ for which $a_0 = 0$.

11. (a) Show that the set $W$ of all polynomials in $P_2$ such that $p(1) = 0$ is a subspace of $P_2$.
(b) Make a conjecture about the dimension of $W$.
(c) Confirm your conjecture by finding a basis for $W$.

12. Find a standard basis vector for $R^3$ that can be added to the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ to produce a basis for $R^3$.
(a) $\mathbf{v}_1 = (-1, 2, 3)$, $\mathbf{v}_2 = (1, -2, -2)$
(b) $\mathbf{v}_1 = (1, -1, 0)$, $\mathbf{v}_2 = (3, 1, -2)$

13. Find standard basis vectors for $R^4$ that can be added to the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ to produce a basis for $R^4$.
$\mathbf{v}_1 = (1, -4, 2, -3)$, $\mathbf{v}_2 = (-3, 8, -4, 6)$


Answer: 


Any two of (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1) can be used. 


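14. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be a basis for a vector space $V$. Show that $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is also a basis, where $\mathbf{u}_1 = \mathbf{v}_1$, $\mathbf{u}_2 = \mathbf{v}_1 + \mathbf{v}_2$, and $\mathbf{u}_3 = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3$.

15. The vectors $\mathbf{v}_1 = (1, -2, 3)$ and $\mathbf{v}_2 = (0, 5, -3)$ are linearly independent. Enlarge $\{\mathbf{v}_1, \mathbf{v}_2\}$ to a basis for $R^3$.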


Answer: 


16. The vectors $\mathbf{v}_1 = (1, -2, 3, -5)$ and $\mathbf{v}_2 = (0, -1, 2, -3)$ are linearly independent. Enlarge $\{\mathbf{v}_1, \mathbf{v}_2\}$ to a basis for $R^4$.

17. (a) Show that for every positive integer $n$, one can find $n + 1$ linearly independent vectors in $F(-\infty, \infty)$. [Hint: Look for polynomials.]
(b) Use the result in part (a) to prove that $F(-\infty, \infty)$ is infinite-dimensional.
(c) Prove that $C(-\infty, \infty)$, $C^m(-\infty, \infty)$, and $C^\infty(-\infty, \infty)$ are infinite-dimensional vector spaces.

18. Let $S$ be a basis for an $n$-dimensional vector space $V$. Show that if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ form a linearly independent set of vectors in $V$, then the coordinate vectors $(\mathbf{v}_1)_S, (\mathbf{v}_2)_S, \ldots, (\mathbf{v}_r)_S$ form a linearly independent set in $R^n$, and conversely.


19. Using the notation from Exercise 18, show that if the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r$ span $V$, then the coordinate vectors $(\mathbf{v}_1)_S, (\mathbf{v}_2)_S, \ldots, (\mathbf{v}_r)_S$ span $R^n$, and conversely.

20. Find a basis for the subspace of $P_2$ spanned by the given vectors.
(a) $-1 + x - 2x^2$, $3 + 3x + 6x^2$, $9$
(b) $1 + x$, $x^2$, $-2 + 2x^2$, $-3x$
(c) $1 + x - 3x^2$, $2 + 2x - 6x^2$, $3 + 3x - 9x^2$

[Hint: Let $S$ be the standard basis for $P_2$, and work with the coordinate vectors relative to $S$ as in Exercises 18 and 19.]


21. Prove: A subspace of a finite-dimensional vector space is finite-dimensional. 


22. State the two parts of Theorem 4.5.2 in contrapositive form. 
True-False Exercises 
In parts (a)-(j) determine whether the statement is true or false, and justify your answer.

(a) The zero vector space has dimension zero.
Answer:
True

(b) There is a set of 17 linearly independent vectors in $R^{17}$.
Answer:
True

(c) There is a set of 11 vectors that span $R^{17}$.
Answer:
False

(d) Every linearly independent set of five vectors in $R^5$ is a basis for $R^5$.
Answer:
True

(e) Every set of five vectors that spans $R^5$ is a basis for $R^5$.
Answer:
True

(f) Every set of vectors that spans $R^n$ contains a basis for $R^n$.
Answer:
True

(g) Every linearly independent set of vectors in $R^n$ is contained in some basis for $R^n$.
Answer:
True

(h) There is a basis for $M_{22}$ consisting of invertible matrices.
Answer:
True

(i) If $A$ has size $n \times n$ and $I_n, A, A^2, \ldots, A^{n^2}$ are distinct matrices, then $\{I_n, A, A^2, \ldots, A^{n^2}\}$ is linearly dependent.
Answer:
True

(j) There are at least two distinct three-dimensional subspaces of $P_2$.
Answer:
False
4.6 Change of Basis


A basis that is suitable for one problem may not be suitable for another, so it is a common process in the study of vector spaces to change from one basis to another. Because a basis is the vector space generalization of a coordinate system, changing bases is akin to changing coordinate axes in $R^2$ and $R^3$. In this section we will study problems related to change of basis.


Coordinate Maps 


If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for a finite-dimensional vector space $V$, and if

\[ (\mathbf{v})_S = (c_1, c_2, \ldots, c_n) \]

is the coordinate vector of $\mathbf{v}$ relative to $S$, then, as observed in Section 4.4, the mapping

\[ \mathbf{v} \to (\mathbf{v})_S \tag{1} \]

creates a connection (a one-to-one correspondence) between vectors in the general vector space $V$ and vectors in the familiar vector space $R^n$. We call 1 the coordinate map from $V$ to $R^n$. In this section we will find it convenient to express coordinate vectors in the matrix form

\[ [\mathbf{v}]_S = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} \tag{2} \]

where the square brackets emphasize the matrix notation (Figure 4.6.1).


Figure 4.6.1 Coordinate map


Change of Basis 


There are many applications in which it is necessary to work with more than one coordinate system. In such 
cases it becomes important to know how the coordinates of a fixed vector relative to each coordinate system 
are related. This leads to the following problem. 


The Change-of-Basis Problem 


If $\mathbf{v}$ is a vector in a finite-dimensional vector space $V$, and if we change the basis for $V$ from a basis $B$ to a basis $B'$, how are the coordinate vectors $[\mathbf{v}]_B$ and $[\mathbf{v}]_{B'}$ related?


Remark To solve this problem, it will be convenient to refer to $B$ as the "old basis" and $B'$ as the "new basis." Thus, our objective is to find a relationship between the old and new coordinates of a fixed vector $\mathbf{v}$ in $V$.


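For simplicity, we will solve this problem for two-dimensional spaces. The solution for $n$-dimensional spaces is similar. Let

\[ B = \{\mathbf{u}_1, \mathbf{u}_2\} \quad\text{and}\quad B' = \{\mathbf{u}_1', \mathbf{u}_2'\} \]

be the old and new bases, respectively. We will need the coordinate vectors for the new basis vectors relative to the old basis. Suppose they are

\[ [\mathbf{u}_1']_B = \begin{bmatrix} a \\ b \end{bmatrix} \quad\text{and}\quad [\mathbf{u}_2']_B = \begin{bmatrix} c \\ d \end{bmatrix} \tag{3} \]

That is,

\[ \begin{aligned} \mathbf{u}_1' &= a\mathbf{u}_1 + b\mathbf{u}_2 \\ \mathbf{u}_2' &= c\mathbf{u}_1 + d\mathbf{u}_2 \end{aligned} \tag{4} \]

Now let $\mathbf{v}$ be any vector in $V$, and let

\[ [\mathbf{v}]_{B'} = \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} \tag{5} \]

be the new coordinate vector, so that

\[ \mathbf{v} = k_1\mathbf{u}_1' + k_2\mathbf{u}_2' \tag{6} \]

In order to find the old coordinates of $\mathbf{v}$, we must express $\mathbf{v}$ in terms of the old basis $B$. To do this, we substitute 4 into 6. This yields

\[ \mathbf{v} = k_1(a\mathbf{u}_1 + b\mathbf{u}_2) + k_2(c\mathbf{u}_1 + d\mathbf{u}_2) \]

or

\[ \mathbf{v} = (k_1a + k_2c)\mathbf{u}_1 + (k_1b + k_2d)\mathbf{u}_2 \]

Thus, the old coordinate vector for $\mathbf{v}$ is

\[ [\mathbf{v}]_B = \begin{bmatrix} k_1a + k_2c \\ k_1b + k_2d \end{bmatrix} \]

which, by using 5, can be written as

\[ [\mathbf{v}]_B = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} [\mathbf{v}]_{B'} \]

This equation states that the old coordinate vector $[\mathbf{v}]_B$ results when we multiply the new coordinate vector $[\mathbf{v}]_{B'}$ on the left by the matrix

\[ P = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]

Since the columns of this matrix are the coordinates of the new basis vectors relative to the old basis [see 3], we have the following solution of the change-of-basis problem.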


Solution of the Change-of-Basis Problem 


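If we change the basis for a vector space $V$ from an old basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ to a new basis $B' = \{\mathbf{u}_1', \mathbf{u}_2', \ldots, \mathbf{u}_n'\}$, then for each vector $\mathbf{v}$ in $V$, the old coordinate vector $[\mathbf{v}]_B$ is related to the new coordinate vector $[\mathbf{v}]_{B'}$ by the equation

\[ [\mathbf{v}]_B = P[\mathbf{v}]_{B'} \tag{7} \]

where the columns of $P$ are the coordinate vectors of the new basis vectors relative to the old basis; that is, the column vectors of $P$ are

\[ [\mathbf{u}_1']_B, \quad [\mathbf{u}_2']_B, \quad \ldots, \quad [\mathbf{u}_n']_B \tag{8} \]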
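
As a small numerical sketch of Equations 7 and 8, the code below builds $P$ column by column and applies it to a new coordinate vector. The data are assumptions chosen for illustration: the old basis is taken to be the standard basis for $R^2$ (so that coordinates relative to it are just the components), the new basis is $\{(1, 1), (2, 1)\}$, and `mat_vec` is a hypothetical helper.

```python
from fractions import Fraction

def mat_vec(P, x):
    """Multiply the matrix P (a list of rows) by the column vector x."""
    return [sum(Fraction(p) * Fraction(xi) for p, xi in zip(row, x)) for row in P]

# Assumed data: old basis B = standard basis of R^2, new basis B' = {(1,1), (2,1)}.
new_basis = [(1, 1), (2, 1)]

# Column j of P is the coordinate vector of the j-th new basis vector
# relative to the old basis (Equation 8).
P = [[new_basis[j][i] for j in range(2)] for i in range(2)]

v_new = [3, -1]            # [v]_{B'}
v_old = mat_vec(P, v_new)  # [v]_B = P [v]_{B'}  (Equation 7)

# Direct check: v = 3*(1,1) + (-1)*(2,1), written in standard coordinates.
direct = [3 * new_basis[0][i] - 1 * new_basis[1][i] for i in range(2)]
print(v_old, direct)
```

Because the old basis here is the standard basis, the product $P[\mathbf{v}]_{B'}$ can be compared directly with the linear combination of the new basis vectors.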


Transition Matrices 


The matrix $P$ in Equation 7 is called the transition matrix from $B'$ to $B$. For emphasis, we will often denote it by $P_{B' \to B}$. It follows from 8 that this matrix can be expressed in terms of its column vectors as

\[ P_{B' \to B} = \bigl[\, [\mathbf{u}_1']_B \mid [\mathbf{u}_2']_B \mid \cdots \mid [\mathbf{u}_n']_B \,\bigr] \tag{9} \]

Similarly, the transition matrix from $B$ to $B'$ can be expressed in terms of its column vectors as

\[ P_{B \to B'} = \bigl[\, [\mathbf{u}_1]_{B'} \mid [\mathbf{u}_2]_{B'} \mid \cdots \mid [\mathbf{u}_n]_{B'} \,\bigr] \tag{10} \]

Remark There is a simple way to remember both of these formulas using the terms "old basis" and "new basis" defined earlier in this section: In Formula 9 the old basis is $B'$ and the new basis is $B$, whereas in Formula 10 the old basis is $B$ and the new basis is $B'$. Thus, both formulas can be restated as follows:

The columns of the transition matrix from an old basis to a new basis are the coordinate vectors of the old basis relative to the new basis.


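EXAMPLE 1 Finding Transition Matrices

Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ for $R^2$, where

\[ \mathbf{u}_1 = (1, 0), \quad \mathbf{u}_2 = (0, 1), \quad \mathbf{u}_1' = (1, 1), \quad \mathbf{u}_2' = (2, 1) \]

(a) Find the transition matrix $P_{B' \to B}$ from $B'$ to $B$.
(b) Find the transition matrix $P_{B \to B'}$ from $B$ to $B'$.

Solution

(a) Here the old basis vectors are $\mathbf{u}_1'$ and $\mathbf{u}_2'$ and the new basis vectors are $\mathbf{u}_1$ and $\mathbf{u}_2$. We want to find the coordinate matrices of the old basis vectors $\mathbf{u}_1'$ and $\mathbf{u}_2'$ relative to the new basis vectors $\mathbf{u}_1$ and $\mathbf{u}_2$. To do this, first we observe that

\[ \begin{aligned} \mathbf{u}_1' &= \mathbf{u}_1 + \mathbf{u}_2 \\ \mathbf{u}_2' &= 2\mathbf{u}_1 + \mathbf{u}_2 \end{aligned} \]

from which it follows that

\[ [\mathbf{u}_1']_B = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}_2']_B = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \]

and hence that

\[ P_{B' \to B} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \]

(b) Here the old basis vectors are $\mathbf{u}_1$ and $\mathbf{u}_2$ and the new basis vectors are $\mathbf{u}_1'$ and $\mathbf{u}_2'$. As in part (a), we want to find the coordinate matrices of the old basis vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ relative to the new basis vectors $\mathbf{u}_1'$ and $\mathbf{u}_2'$. To do this, observe that

\[ \begin{aligned} \mathbf{u}_1 &= -\mathbf{u}_1' + \mathbf{u}_2' \\ \mathbf{u}_2 &= 2\mathbf{u}_1' - \mathbf{u}_2' \end{aligned} \]

from which it follows that

\[ [\mathbf{u}_1]_{B'} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}_2]_{B'} = \begin{bmatrix} 2 \\ -1 \end{bmatrix} \]

and hence that

\[ P_{B \to B'} = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix} \]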


Suppose now that $B$ and $B'$ are bases for a finite-dimensional vector space $V$. Since multiplication by $P_{B' \to B}$ maps coordinate vectors relative to the basis $B'$ into coordinate vectors relative to the basis $B$, and $P_{B \to B'}$ maps coordinate vectors relative to $B$ into coordinate vectors relative to $B'$, it follows that for every vector $\mathbf{v}$ in $V$ we have

\[ [\mathbf{v}]_B = P_{B' \to B}[\mathbf{v}]_{B'} \tag{11} \]
\[ [\mathbf{v}]_{B'} = P_{B \to B'}[\mathbf{v}]_B \tag{12} \]


EXAMPLE 2 Computing Coordinate Vectors — 


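Let $B$ and $B'$ be the bases in Example 1. Use an appropriate formula to find $[\mathbf{v}]_B$ given that

\[ [\mathbf{v}]_{B'} = \begin{bmatrix} -3 \\ 5 \end{bmatrix} \]

Solution To find $[\mathbf{v}]_B$ we need to make the transition from $B'$ to $B$. It follows from Formula 11 and part (a) of Example 1 that

\[ [\mathbf{v}]_B = P_{B' \to B}[\mathbf{v}]_{B'} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ 2 \end{bmatrix} \]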


Invertibility of Transition Matrices 


If $B$ and $B'$ are bases for a finite-dimensional vector space $V$, then

\[ (P_{B' \to B})(P_{B \to B'}) = P_{B \to B} \]

because multiplication by $(P_{B' \to B})(P_{B \to B'})$ first maps $B$-coordinates of a vector into $B'$-coordinates, and then maps those $B'$-coordinates back into the original $B$-coordinates. Since the net effect of the two operations is to leave each coordinate vector unchanged, we are led to conclude that $P_{B \to B}$ must be the identity matrix, that is,

\[ (P_{B' \to B})(P_{B \to B'}) = I \tag{13} \]

(we omit the formal proof). For example, for the transition matrices obtained in Example 1 we have

\[ (P_{B' \to B})(P_{B \to B'}) = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]

It follows from 13 that $P_{B' \to B}$ is invertible and that its inverse is $P_{B \to B'}$. Thus, we have the following theorem.


THEOREM 4.6.1 


If $P$ is the transition matrix from a basis $B'$ to a basis $B$ for a finite-dimensional vector space $V$, then $P$ is invertible and $P^{-1}$ is the transition matrix from $B$ to $B'$.
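
The theorem can be checked on a small case. The sketch below assumes the $2 \times 2$ transition matrix with rows (1, 2) and (1, 1), inverts it with the usual $2 \times 2$ formula, and confirms that the product is the identity; `mat_mul` and `inverse_2x2` are hypothetical helper names used only for this illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(Fraction(A[i][k]) * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

P = [[1, 2], [1, 1]]       # assumed transition matrix from B' to B
P_inv = inverse_2x2(P)     # by the theorem, the transition matrix from B to B'
print(P_inv)
print(mat_mul(P, P_inv))
```

Exact fractions are used so that the comparison with the identity matrix does not depend on floating-point rounding.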


An Efficient Method for Computing Transition Matrices for R” 


Our next objective is to develop an efficient procedure for computing transition matrices between bases for $R^n$. As illustrated in Example 1, the first step in computing a transition matrix is to express each new basis vector as a linear combination of the old basis vectors. For $R^n$ this involves solving $n$ linear systems of $n$ equations in $n$ unknowns, each of which has the same coefficient matrix (why?). An efficient way to do this is by the method illustrated in Example 2 of Section 1.6, which is as follows:


A Procedure for Computing $P_{B \to B'}$

Step 1 Form the matrix $[B' \mid B]$.

Step 2 Use elementary row operations to reduce the matrix in Step 1 to reduced row echelon form.

Step 3 The resulting matrix will be $[I \mid P_{B \to B'}]$.

Step 4 Extract the matrix $P_{B \to B'}$ from the right side of the matrix in Step 3.


This procedure is captured in the following diagram.

\[ [\text{new basis} \mid \text{old basis}] \xrightarrow{\ \text{row operations}\ } [I \mid \text{transition from old to new}] \tag{14} \]
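
The four steps above can be sketched in code. This is a minimal illustration rather than a definitive implementation: it assumes both bases are given as lists of vectors in $R^n$, uses exact rational arithmetic, and the function name `transition_matrix` is hypothetical.

```python
from fractions import Fraction

def transition_matrix(old_basis, new_basis):
    """Row-reduce [new basis | old basis] to [I | P] and return P."""
    n = len(new_basis)
    # Step 1: basis vectors become columns, so row i of the augmented matrix
    # holds the i-th entries of the new vectors, then of the old vectors.
    aug = [[Fraction(v[i]) for v in new_basis] + [Fraction(v[i]) for v in old_basis]
           for i in range(n)]
    # Step 2: Gauss-Jordan elimination on the left block.
    for c in range(n):
        # next() raises StopIteration if new_basis is not actually a basis.
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        lead = aug[c][c]
        aug[c] = [x / lead for x in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    # Steps 3 and 4: the left block is now I; the right block is the answer.
    return [row[n:] for row in aug]

B = [(1, 0), (0, 1)]
B_prime = [(1, 1), (2, 1)]
print(transition_matrix(old_basis=B, new_basis=B_prime))
print(transition_matrix(old_basis=B_prime, new_basis=B))
```

Reducing one augmented matrix solves all $n$ linear systems at once, which is why they can share the same coefficient matrix.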


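EXAMPLE 3 Example 1 Revisited

In Example 1 we considered the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ for $R^2$, where

\[ \mathbf{u}_1 = (1, 0), \quad \mathbf{u}_2 = (0, 1), \quad \mathbf{u}_1' = (1, 1), \quad \mathbf{u}_2' = (2, 1) \]

(a) Use Formula 14 to find the transition matrix from $B'$ to $B$.
(b) Use Formula 14 to find the transition matrix from $B$ to $B'$.

Solution

(a) Here $B'$ is the old basis and $B$ is the new basis, so

\[ [\text{new basis} \mid \text{old basis}] = \left[\begin{array}{cc|cc} 1 & 0 & 1 & 2 \\ 0 & 1 & 1 & 1 \end{array}\right] \]

Since the left side is already the identity matrix, no reduction is needed. We see by inspection that the transition matrix is

\[ P_{B' \to B} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \]

which agrees with the result in Example 1.

(b) Here $B$ is the old basis and $B'$ is the new basis, so

\[ [\text{new basis} \mid \text{old basis}] = \left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right] \]

By reducing this matrix so that the left side becomes the identity, we obtain (verify)

\[ [I \mid \text{transition from old to new}] = \left[\begin{array}{cc|cc} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \end{array}\right] \]

so the transition matrix is

\[ P_{B \to B'} = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix} \]

which also agrees with the result in Example 1.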


Transition to the Standard Basis for R" 


Note that in part (a) of the last example the column vectors of the matrix that made the transition from the basis $B'$ to the standard basis turned out to be the vectors in $B'$ written in column form. This illustrates the following general result.


THEOREM 4.6.2

Let $B' = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be any basis for the vector space $R^n$ and let $S = \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ be the standard basis for $R^n$. If the vectors in these bases are written in column form, then

\[ P_{B' \to S} = [\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] \tag{15} \]


It follows from this theorem that if

\[ A = [\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] \]

is any invertible $n \times n$ matrix, then $A$ can be viewed as the transition matrix from the basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ for $R^n$ to the standard basis for $R^n$. Thus, for example, the matrix

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix} \]

which was shown to be invertible in Example 4 of Section 1.5, is the transition matrix from the basis

\[ \mathbf{u}_1 = (1, 2, 1), \quad \mathbf{u}_2 = (2, 5, 0), \quad \mathbf{u}_3 = (3, 3, 8) \]

to the basis

\[ \mathbf{e}_1 = (1, 0, 0), \quad \mathbf{e}_2 = (0, 1, 0), \quad \mathbf{e}_3 = (0, 0, 1) \]


Concept Review 
* Coordinate map 
* Change-of-basis problem 


* Transition matrix 


Skills 
* Find coordinate vectors relative to a given basis directly. 
* Find the transition matrix from one basis to another. 


* Use the transition matrix to compute coordinate vectors. 


Exercise Set 4.6 


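1. Find the coordinate vector for $\mathbf{w}$ relative to the basis $S = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$.
(a) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (0, 1)$; $\mathbf{w} = (3, -7)$
(b) $\mathbf{u}_1 = (2, -4)$, $\mathbf{u}_2 = (3, 8)$; $\mathbf{w} = (1, 1)$
(c) $\mathbf{u}_1 = (1, 1)$, $\mathbf{u}_2 = (0, 2)$; $\mathbf{w} = (a, b)$

Answer:

(a) $[\mathbf{w}]_S = \begin{bmatrix} 3 \\ -7 \end{bmatrix}$
(b) $[\mathbf{w}]_S = \begin{bmatrix} \frac{5}{28} \\ \frac{3}{14} \end{bmatrix}$
(c) $[\mathbf{w}]_S = \begin{bmatrix} a \\ \frac{b - a}{2} \end{bmatrix}$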


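2. Find the coordinate vector for $\mathbf{v}$ relative to the basis $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$.
(a) $\mathbf{v} = (2, -1, 3)$; $\mathbf{v}_1 = (1, 0, 0)$, $\mathbf{v}_2 = (2, 2, 0)$, $\mathbf{v}_3 = (3, 3, 3)$
(b) $\mathbf{v} = (5, -12, 3)$; $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (-4, 5, 6)$, $\mathbf{v}_3 = (7, -8, 9)$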


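3. Find the coordinate vector for $\mathbf{p}$ relative to the basis $S = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$ for $P_2$.
(a) $\mathbf{p} = 4 - 3x + x^2$; $\mathbf{p}_1 = 1$, $\mathbf{p}_2 = x$, $\mathbf{p}_3 = x^2$
(b) $\mathbf{p} = 2 - x + x^2$; $\mathbf{p}_1 = 1 + x$, $\mathbf{p}_2 = 1 + x^2$, $\mathbf{p}_3 = x + x^2$

Answer:

(a) $(\mathbf{p})_S = (4, -3, 1)$, $[\mathbf{p}]_S = \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix}$
(b) $(\mathbf{p})_S = (0, 2, -1)$, $[\mathbf{p}]_S = \begin{bmatrix} 0 \\ 2 \\ -1 \end{bmatrix}$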

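4. Find the coordinate vector for $A$ relative to the basis $S = \{A_1, A_2, A_3, A_4\}$ for $M_{22}$.

\[ A = \begin{bmatrix} 2 & 0 \\ -1 & 3 \end{bmatrix}; \quad A_1 = \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad A_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \]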


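5. Consider the coordinate vectors

\[ [\mathbf{w}]_S = \begin{bmatrix} 6 \\ -1 \\ 4 \end{bmatrix}, \quad [\mathbf{q}]_S = \begin{bmatrix} 3 \\ 0 \\ 4 \end{bmatrix}, \quad [B]_S = \begin{bmatrix} -8 \\ 7 \\ 6 \\ 3 \end{bmatrix} \]

(a) Find $\mathbf{w}$ if $S$ is the basis in Exercise 2(a).
(b) Find $\mathbf{q}$ if $S$ is the basis in Exercise 3(a).
(c) Find $B$ if $S$ is the basis in Exercise 4.

Answer:

(a) $\mathbf{w} = (16, 10, 12)$
(b) $\mathbf{q} = 3 + 4x^2$
(c) $B = \begin{bmatrix} 15 & -1 \\ 6 & 3 \end{bmatrix}$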


6. Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ for $R^2$, where


х «-[) -E «-[3 


(a) Find the transition matrix from $B'$ to $B$.
(b) Find the transition matrix from $B$ to $B'$.
(c) Compute the coordinate vector $[\mathbf{w}]_B$, where

\[ \mathbf{w} = \begin{bmatrix} 3 \\ -5 \end{bmatrix} \]

and use 10 to compute $[\mathbf{w}]_{B'}$.
(d) Check your work by computing $[\mathbf{w}]_{B'}$ directly.


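7. Repeat the directions of Exercise 6 with the same vector $\mathbf{w}$ but with

\[ \mathbf{u}_1 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \quad \mathbf{u}_1' = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad \mathbf{u}_2' = \begin{bmatrix} -1 \\ -1 \end{bmatrix} \]

Answer:

(a) $\begin{bmatrix} \frac{13}{10} & -\frac{1}{2} \\ -\frac{2}{5} & 0 \end{bmatrix}$
(b) $\begin{bmatrix} 0 & -\frac{5}{2} \\ -2 & -\frac{13}{2} \end{bmatrix}$
(c) $[\mathbf{w}]_B = \begin{bmatrix} -\frac{17}{10} \\ \frac{8}{5} \end{bmatrix}$, $[\mathbf{w}]_{B'} = \begin{bmatrix} -4 \\ -7 \end{bmatrix}$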


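8. Consider the bases $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2', \mathbf{u}_3'\}$ for $R^3$, where

\[ \mathbf{u}_1 = \begin{bmatrix} -3 \\ 0 \\ -3 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} -3 \\ 2 \\ -1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} 1 \\ 6 \\ -1 \end{bmatrix} \]

\[ \mathbf{u}_1' = \begin{bmatrix} -6 \\ -6 \\ 0 \end{bmatrix}, \quad \mathbf{u}_2' = \begin{bmatrix} -2 \\ -6 \\ 4 \end{bmatrix}, \quad \mathbf{u}_3' = \begin{bmatrix} -2 \\ -3 \\ 7 \end{bmatrix} \]

(a) Find the transition matrix from $B$ to $B'$.
(b) Compute the coordinate vector $[\mathbf{w}]_B$, where

\[ \mathbf{w} = \begin{bmatrix} -5 \\ 8 \\ -5 \end{bmatrix} \]

and use 12 to compute $[\mathbf{w}]_{B'}$.
(c) Check your work by computing $[\mathbf{w}]_{B'}$ directly.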


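9. Repeat the directions of Exercise 8 with the same vector $\mathbf{w}$, but with

\[ \mathbf{u}_1 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \]

\[ \mathbf{u}_1' = \begin{bmatrix} 3 \\ 1 \\ -5 \end{bmatrix}, \quad \mathbf{u}_2' = \begin{bmatrix} 1 \\ 1 \\ -3 \end{bmatrix}, \quad \mathbf{u}_3' = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \]

Answer:

(a) $\begin{bmatrix} 3 & 2 & \frac{5}{2} \\ -2 & -3 & -\frac{1}{2} \\ 5 & 1 & 6 \end{bmatrix}$
(b) $[\mathbf{w}]_B = \begin{bmatrix} 9 \\ -9 \\ -5 \end{bmatrix}$, $[\mathbf{w}]_{B'} = \begin{bmatrix} -\frac{7}{2} \\ \frac{23}{2} \\ 6 \end{bmatrix}$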


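10. Consider the bases $B = \{\mathbf{p}_1, \mathbf{p}_2\}$ and $B' = \{\mathbf{q}_1, \mathbf{q}_2\}$ for $P_1$, where
$\mathbf{p}_1 = 6 + 3x$, $\mathbf{p}_2 = 10 + 2x$, $\mathbf{q}_1 = 2$, $\mathbf{q}_2 = 3 + 2x$
(a) Find the transition matrix from $B'$ to $B$.
(b) Find the transition matrix from $B$ to $B'$.
(c) Compute the coordinate vector $[\mathbf{p}]_B$, where $\mathbf{p} = -4 + x$, and use 12 to compute $[\mathbf{p}]_{B'}$.
(d) Check your work by computing $[\mathbf{p}]_{B'}$ directly.

11. Let $V$ be the space spanned by $\mathbf{f}_1 = \sin x$ and $\mathbf{f}_2 = \cos x$.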


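(a) Show that $\mathbf{g}_1 = 2\sin x + \cos x$ and $\mathbf{g}_2 = 3\cos x$ form a basis for $V$.
(b) Find the transition matrix from $B' = \{\mathbf{g}_1, \mathbf{g}_2\}$ to $B = \{\mathbf{f}_1, \mathbf{f}_2\}$.
(c) Find the transition matrix from $B$ to $B'$.
(d) Compute the coordinate vector $[\mathbf{h}]_B$, where $\mathbf{h} = 2\sin x - 5\cos x$, and use 12 to obtain $[\mathbf{h}]_{B'}$.
(e) Check your work by computing $[\mathbf{h}]_{B'}$ directly.

Answer:

(b) $\begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix}$
(c) $\begin{bmatrix} \frac{1}{2} & 0 \\ -\frac{1}{6} & \frac{1}{3} \end{bmatrix}$
(d) $[\mathbf{h}]_B = \begin{bmatrix} 2 \\ -5 \end{bmatrix}$, $[\mathbf{h}]_{B'} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$
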


12. Let $S$ be the standard basis for $R^2$, and let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ be the basis in which $\mathbf{v}_1 = (2, 1)$ and $\mathbf{v}_2 = (-3, 4)$.
(a) Find the transition matrix $P_{B \to S}$ by inspection.
(b) Use Formula 14 to find the transition matrix $P_{S \to B}$.
(c) Confirm that $P_{B \to S}$ and $P_{S \to B}$ are inverses of one another.
(d) Let $\mathbf{w} = (5, -3)$. Find $[\mathbf{w}]_B$ and then use Formula 11 to compute $[\mathbf{w}]_S$.
(e) Let $\mathbf{w} = (3, -5)$. Find $[\mathbf{w}]_S$ and then use Formula 12 to compute $[\mathbf{w}]_B$.


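13. Let $S$ be the standard basis for $R^3$, and let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be the basis in which $\mathbf{v}_1 = (1, 2, 1)$, $\mathbf{v}_2 = (2, 5, 0)$, and $\mathbf{v}_3 = (3, 3, 8)$.
(a) Find the transition matrix $P_{B \to S}$ by inspection.
(b) Use Formula 14 to find the transition matrix $P_{S \to B}$.
(c) Confirm that $P_{B \to S}$ and $P_{S \to B}$ are inverses of one another.
(d) Let $\mathbf{w} = (5, -3, 1)$. Find $[\mathbf{w}]_B$ and then use Formula 11 to compute $[\mathbf{w}]_S$.
(e) Let $\mathbf{w} = (3, -5, 0)$. Find $[\mathbf{w}]_S$ and then use Formula 12 to compute $[\mathbf{w}]_B$.

Answer:

(a) $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{bmatrix}$
(b) $\begin{bmatrix} -40 & 16 & 9 \\ 13 & -5 & -3 \\ 5 & -2 & -1 \end{bmatrix}$
(d) $[\mathbf{w}]_B = \begin{bmatrix} -239 \\ 77 \\ 30 \end{bmatrix}$, $[\mathbf{w}]_S = \begin{bmatrix} 5 \\ -3 \\ 1 \end{bmatrix}$
(e) $[\mathbf{w}]_S = \begin{bmatrix} 3 \\ -5 \\ 0 \end{bmatrix}$, $[\mathbf{w}]_B = \begin{bmatrix} -200 \\ 64 \\ 25 \end{bmatrix}$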


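14. Let $B_1 = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B_2 = \{\mathbf{v}_1, \mathbf{v}_2\}$ be the bases for $R^2$ in which $\mathbf{u}_1 = (2, 2)$, $\mathbf{u}_2 = (4, -1)$, $\mathbf{v}_1 = (1, 3)$, and $\mathbf{v}_2 = (-1, -1)$.
(a) Use Formula 14 to find the transition matrix $P_{B_2 \to B_1}$.
(b) Use Formula 14 to find the transition matrix $P_{B_1 \to B_2}$.
(c) Confirm that $P_{B_2 \to B_1}$ and $P_{B_1 \to B_2}$ are inverses of one another.
(d) Let $\mathbf{w} = (5, -3)$. Find $[\mathbf{w}]_{B_1}$ and then use the matrix $P_{B_1 \to B_2}$ to compute $[\mathbf{w}]_{B_2}$ from $[\mathbf{w}]_{B_1}$.
(e) Let $\mathbf{w} = (3, -5)$. Find $[\mathbf{w}]_{B_2}$ and then use the matrix $P_{B_2 \to B_1}$ to compute $[\mathbf{w}]_{B_1}$ from $[\mathbf{w}]_{B_2}$.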


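15. Let $B_1 = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B_2 = \{\mathbf{v}_1, \mathbf{v}_2\}$ be the bases for $R^2$ in which $\mathbf{u}_1 = (1, 2)$, $\mathbf{u}_2 = (2, 3)$, $\mathbf{v}_1 = (1, 3)$, and $\mathbf{v}_2 = (1, 4)$.
(a) Use Formula 14 to find the transition matrix $P_{B_2 \to B_1}$.
(b) Use Formula 14 to find the transition matrix $P_{B_1 \to B_2}$.
(c) Confirm that $P_{B_2 \to B_1}$ and $P_{B_1 \to B_2}$ are inverses of one another.
(d) Let $\mathbf{w} = (0, 1)$. Find $[\mathbf{w}]_{B_1}$ and then use the matrix $P_{B_1 \to B_2}$ to compute $[\mathbf{w}]_{B_2}$ from $[\mathbf{w}]_{B_1}$.
(e) Let $\mathbf{w} = (2, 5)$. Find $[\mathbf{w}]_{B_2}$ and then use the matrix $P_{B_2 \to B_1}$ to compute $[\mathbf{w}]_{B_1}$ from $[\mathbf{w}]_{B_2}$.

Answer:

(a) $\begin{bmatrix} 3 & 5 \\ -1 & -2 \end{bmatrix}$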

16. Let $B_1 = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B_2 = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be the bases for $R^3$ in which $\mathbf{u}_1 = (-3, 0, -3)$, $\mathbf{u}_2 = (-3, 2, -1)$, $\mathbf{u}_3 = (1, 6, -1)$, $\mathbf{v}_1 = (-6, -6, 0)$, $\mathbf{v}_2 = (-2, -6, 4)$, and $\mathbf{v}_3 = (-2, -3, 7)$.
(a) Find the transition matrix $P_{B_1 \to B_2}$.
(b) Let $\mathbf{w} = (-5, 8, -5)$. Find $[\mathbf{w}]_{B_1}$ and then use the transition matrix obtained in part (a) to compute $[\mathbf{w}]_{B_2}$ by matrix multiplication.
(c) Check the result in part (b) by computing $[\mathbf{w}]_{B_2}$ directly.


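17. Follow the directions of Exercise 16 with the same vector $\mathbf{w}$ but with $\mathbf{u}_1 = (2, 1, 1)$, $\mathbf{u}_2 = (2, -1, 1)$, $\mathbf{u}_3 = (1, 2, 1)$, $\mathbf{v}_1 = (3, 1, -5)$, $\mathbf{v}_2 = (1, 1, -3)$, and $\mathbf{v}_3 = (-1, 0, 2)$.

Answer:

(a) $\begin{bmatrix} 3 & 2 & \frac{5}{2} \\ -2 & -3 & -\frac{1}{2} \\ 5 & 1 & 6 \end{bmatrix}$
(b) $[\mathbf{w}]_{B_1} = \begin{bmatrix} 9 \\ -9 \\ -5 \end{bmatrix}$, $[\mathbf{w}]_{B_2} = \begin{bmatrix} -\frac{7}{2} \\ \frac{23}{2} \\ 6 \end{bmatrix}$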


18. Let $S = \{\mathbf{e}_1, \mathbf{e}_2\}$ be the standard basis for $R^2$, and let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ be the basis that results when the vectors in $S$ are reflected about the line $y = x$.
(a) Find the transition matrix $P_{B \to S}$.
(b) Let $P = P_{B \to S}$ and show that $P^T = P_{S \to B}$.

19. Let $S = \{\mathbf{e}_1, \mathbf{e}_2\}$ be the standard basis for $R^2$, and let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ be the basis that results when the vectors in $S$ are reflected about the line that makes an angle $\theta$ with the positive $x$-axis.
(a) Find the transition matrix $P_{B \to S}$.
(b) Let $P = P_{B \to S}$ and show that $P^T = P_{S \to B}$.

Answer:

(a) $\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$


20. If $B_1$, $B_2$, and $B_3$ are bases for $R^2$, and if


3 1 t 2 
Рав, | and Рв, = |3 El 


then Рр» ву = — 


21. If $P$ is the transition matrix from a basis $B'$ to a basis $B$, and $Q$ is the transition matrix from $B$ to a basis $C$, what is the transition matrix from $B'$ to $C$? What is the transition matrix from $C$ to $B'$?

22. To write the coordinate vector for a vector, it is necessary to specify an order for the vectors in the basis. If $P$ is the transition matrix from a basis $B'$ to a basis $B$, what is the effect on $P$ if we reverse the order of vectors in $B$ from $\mathbf{v}_1, \ldots, \mathbf{v}_n$ to $\mathbf{v}_n, \ldots, \mathbf{v}_1$? What is the effect on $P$ if we reverse the order of vectors in both $B'$ and $B$?


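23. Consider the matrix

\[ P = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 2 \\ 0 & 2 & 1 \end{bmatrix} \]

(a) $P$ is the transition matrix from what basis $B$ to the standard basis $S = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ for $R^3$?
(b) $P$ is the transition matrix from the standard basis $S = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ to what basis $B$ for $R^3$?

Answer:

(a) $B = \{(1, 1, 0), (1, 0, 2), (0, 2, 1)\}$
(b) $B = \left\{\left(\tfrac{4}{5}, \tfrac{1}{5}, -\tfrac{2}{5}\right), \left(\tfrac{1}{5}, -\tfrac{1}{5}, \tfrac{2}{5}\right), \left(-\tfrac{2}{5}, \tfrac{2}{5}, \tfrac{1}{5}\right)\right\}$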


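24. The matrix

\[ P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 2 \\ 0 & 1 & 1 \end{bmatrix} \]

is the transition matrix from what basis $B$ to the basis $\{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$ for $R^3$?

25. Let $B$ be a basis for $R^n$. Prove that the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ form a linearly independent set in $R^n$ if and only if the vectors $[\mathbf{v}_1]_B, [\mathbf{v}_2]_B, \ldots, [\mathbf{v}_k]_B$ form a linearly independent set in $R^n$.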


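26. Let $B$ be a basis for $R^n$. Prove that the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ span $R^n$ if and only if the vectors $[\mathbf{v}_1]_B, [\mathbf{v}_2]_B, \ldots, [\mathbf{v}_k]_B$ span $R^n$.

27. If $[\mathbf{w}]_B = \mathbf{w}$ holds for all vectors $\mathbf{w}$ in $R^n$, what can you say about the basis $B$?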


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer.

(a) If $B_1$ and $B_2$ are bases for a vector space $V$, then there exists a transition matrix from $B_1$ to $B_2$.
Answer:
True

(b) Transition matrices are invertible.
Answer:
True

(c) If $B$ is a basis for a vector space $R^n$, then $P_{B \to B}$ is the identity matrix.
Answer:
True

(d) If $P_{B_1 \to B_2}$ is a diagonal matrix, then each vector in $B_2$ is a scalar multiple of some vector in $B_1$.
Answer:
True

(e) If each vector in $B_2$ is a scalar multiple of some vector in $B_1$, then $P_{B_1 \to B_2}$ is a diagonal matrix.
Answer:
False

(f) If $A$ is a square matrix, then $A = P_{B_1 \to B_2}$ for some bases $B_1$ and $B_2$ for $R^n$.
Answer:
False
4.7 Row Space, Column Space, and Null Space


In this section we will study some important vector spaces that are associated with matrices. Our work here will provide 
us with a deeper understanding of the relationships between the solutions of a linear system and properties of its 
coefficient matrix. 


Row Space, Column Space, and Null Space 


Recall that vectors can be written in comma-delimited form or in matrix form as either row vectors or column vectors. 
In this section we will use the latter two. 


DEFINITION 1 


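For an $m \times n$ matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]

the vectors

\[ \begin{aligned} \mathbf{r}_1 &= [a_{11} \;\; a_{12} \;\; \cdots \;\; a_{1n}] \\ \mathbf{r}_2 &= [a_{21} \;\; a_{22} \;\; \cdots \;\; a_{2n}] \\ &\;\;\vdots \\ \mathbf{r}_m &= [a_{m1} \;\; a_{m2} \;\; \cdots \;\; a_{mn}] \end{aligned} \]

in $R^n$ that are formed from the rows of $A$ are called the row vectors of $A$, and the vectors

\[ \mathbf{c}_1 = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad \mathbf{c}_2 = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad \mathbf{c}_n = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \]

in $R^m$ formed from the columns of $A$ are called the column vectors of $A$.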


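EXAMPLE 1 Row and Column Vectors of a 2 × 3 Matrix

Let

\[ A = \begin{bmatrix} 2 & 1 & 0 \\ 3 & -1 & 4 \end{bmatrix} \]

The row vectors of $A$ are

\[ \mathbf{r}_1 = [2 \;\; 1 \;\; 0] \quad\text{and}\quad \mathbf{r}_2 = [3 \;\; {-1} \;\; 4] \]

and the column vectors of $A$ are

\[ \mathbf{c}_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad \mathbf{c}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \quad \mathbf{c}_3 = \begin{bmatrix} 0 \\ 4 \end{bmatrix} \]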


The following definition defines three important vector spaces associated with a matrix. 


DEFINITION 2 


If $A$ is an $m \times n$ matrix, then the subspace of $R^n$ spanned by the row vectors of $A$ is called the row space of $A$, and the subspace of $R^m$ spanned by the column vectors of $A$ is called the column space of $A$. The solution space of the homogeneous system of equations $A\mathbf{x} = \mathbf{0}$, which is a subspace of $R^n$, is called the null space of $A$.


In this section and the next we will be concerned with two general questions: 


Question 1. What relationships exist among the solutions of a linear system $A\mathbf{x} = \mathbf{b}$ and the row space, column space, and null space of the coefficient matrix $A$?


Question 2. What relationships exist among the row space, column space, and null space of a matrix? 


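Starting with the first question, suppose that

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \quad\text{and}\quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

It follows from Formula 10 of Section 1.3 that if $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ denote the column vectors of $A$, then the product $A\mathbf{x}$ can be expressed as a linear combination of these vectors with coefficients from $\mathbf{x}$; that is,

\[ A\mathbf{x} = x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_n\mathbf{c}_n \tag{1} \]

Thus, a linear system, $A\mathbf{x} = \mathbf{b}$, of $m$ equations in $n$ unknowns can be written as

\[ x_1\mathbf{c}_1 + x_2\mathbf{c}_2 + \cdots + x_n\mathbf{c}_n = \mathbf{b} \tag{2} \]

from which we conclude that $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b}$ is expressible as a linear combination of the column vectors of $A$. This yields the following theorem.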


THEOREM 4.7.1 


A system of linear equations $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b}$ is in the column space of $A$.


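EXAMPLE 2 A Vector b in the Column Space of A

Let $A\mathbf{x} = \mathbf{b}$ be the linear system

\[ \begin{bmatrix} -1 & 3 & 2 \\ 1 & 2 & -3 \\ 2 & 1 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix} \]

Show that $\mathbf{b}$ is in the column space of $A$ by expressing it as a linear combination of the column vectors of $A$.

Solution Solving the system by Gaussian elimination yields (verify)

\[ x_1 = 2, \quad x_2 = -1, \quad x_3 = 3 \]

It follows from this and Formula 2 that

\[ 2\begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} - \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + 3\begin{bmatrix} 2 \\ -3 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -9 \\ -3 \end{bmatrix} \]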
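
One way to test the criterion of Theorem 4.7.1 numerically is to compare the rank of $A$ with the rank of the augmented matrix $[A \mid \mathbf{b}]$; the two agree exactly when $\mathbf{b}$ lies in the column space. That rank test is a standard equivalent formulation, not the method used in the example, and the helper names `rank` and `is_consistent` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_consistent(A, b):
    """Ax = b is consistent exactly when appending b does not raise the rank."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)

A = [[-1, 3, 2], [1, 2, -3], [2, 1, -2]]
b = [1, -9, -3]
x = [2, -1, 3]

# Formula 2: x1*c1 + x2*c2 + x3*c3, computed one component at a time.
combo = [sum(x[j] * A[i][j] for j in range(3)) for i in range(3)]
print(is_consistent(A, b), combo)

# A system whose right side is outside the column space.
print(is_consistent([[1, 3], [2, 6]], [1, 0]))
```

The second call uses a matrix whose columns are all multiples of (1, 2), so a right side such as (1, 0) cannot be reached.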


Recall from Theorem 3.4.4 that the general solution of a consistent linear system $A\mathbf{x} = \mathbf{b}$ can be obtained by adding any specific solution of this system to the general solution of the corresponding homogeneous system $A\mathbf{x} = \mathbf{0}$. Keeping in mind that the null space of $A$ is the same as the solution space of $A\mathbf{x} = \mathbf{0}$, we can rephrase that theorem in the following vector form.


THEOREM 4.7.2 


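If $\mathbf{x}_0$ is any solution of a consistent linear system $A\mathbf{x} = \mathbf{b}$, and if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is a basis for the null space of $A$, then every solution of $A\mathbf{x} = \mathbf{b}$ can be expressed in the form

\[ \mathbf{x} = \mathbf{x}_0 + c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k \tag{3} \]

Conversely, for all choices of scalars $c_1, c_2, \ldots, c_k$, the vector $\mathbf{x}$ in this formula is a solution of $A\mathbf{x} = \mathbf{b}$.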


Equation 3 gives a formula for the general solution of $A\mathbf{x} = \mathbf{b}$. The vector $\mathbf{x}_0$ in that formula is called a particular solution of $A\mathbf{x} = \mathbf{b}$, and the remaining part of the formula is called the general solution of $A\mathbf{x} = \mathbf{0}$. In words, this formula tells us the following.


The general solution of a consistent linear system can be expressed as the sum of a particular solution of that system and the general solution of the corresponding homogeneous system.


Geometrically, the solution set of $A\mathbf{x} = \mathbf{b}$ can be viewed as the translation by $\mathbf{x}_0$ of the solution space of $A\mathbf{x} = \mathbf{0}$ (Figure 4.7.1).


Figure 4.7.1 (labels: solution set of $A\mathbf{x} = \mathbf{b}$; solution space of $A\mathbf{x} = \mathbf{0}$)


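EXAMPLE 3 General Solution of a Linear System Ax = b

In the concluding subsection of Section 3.4 we compared solutions of the linear systems

\[ \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 \\ 2 & 6 & -5 & -2 & 4 & -3 \\ 0 & 0 & 5 & 10 & 0 & 15 \\ 2 & 6 & 0 & 8 & 4 & 18 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 \\ 2 & 6 & -5 & -2 & 4 & -3 \\ 0 & 0 & 5 & 10 & 0 & 15 \\ 2 & 6 & 0 & 8 & 4 & 18 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \\ 5 \\ 6 \end{bmatrix} \]

and deduced that the general solution $\mathbf{x}$ of the nonhomogeneous system and the general solution $\mathbf{x}_h$ of the corresponding homogeneous system (when written in column-vector form) are related by

\[ \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}}_{\mathbf{x}} = \begin{bmatrix} -3r - 4s - 2t \\ r \\ -2s \\ s \\ t \\ \frac{1}{3} \end{bmatrix} = \underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \frac{1}{3} \end{bmatrix}}_{\mathbf{x}_0} + \underbrace{r\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}}_{\mathbf{x}_h} \]

Recall from the Remark following Example 4 of Section 4.5 that the vectors in $\mathbf{x}_h$ form a basis for the solution space of $A\mathbf{x} = \mathbf{0}$.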
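
The decomposition in this example can be spot-checked by substitution. The sketch below assumes the coefficient matrix, right side, particular solution, and null space vectors shown in the example, and evaluates $A\mathbf{x}$ for a few arbitrary parameter values; `mat_vec` and `general_solution` are hypothetical helper names.

```python
from fractions import Fraction

A = [[1, 3, -2, 0, 2, 0],
     [2, 6, -5, -2, 4, -3],
     [0, 0, 5, 10, 0, 15],
     [2, 6, 0, 8, 4, 18]]
b = [0, -1, 5, 6]

x0 = [0, 0, 0, 0, 0, Fraction(1, 3)]   # particular solution
v1 = [-3, 1, 0, 0, 0, 0]               # null space vectors
v2 = [-4, 0, -2, 1, 0, 0]
v3 = [-2, 0, 0, 0, 1, 0]

def mat_vec(M, x):
    """Multiply the matrix M (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

def general_solution(r, s, t):
    """x = x0 + r*v1 + s*v2 + t*v3, as in Formula 3."""
    return [p + r * a + s * c + t * d for p, a, c, d in zip(x0, v1, v2, v3)]

for r, s, t in [(0, 0, 0), (1, -2, 5), (Fraction(1, 2), 3, -4)]:
    print(mat_vec(A, general_solution(r, s, t)) == b)
```

A check at finitely many parameter values is not a proof, but each of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ can also be multiplied by $A$ separately to confirm that it lands on the zero vector.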


Bases for Row Spaces, Column Spaces, and Null Spaces 


We first developed elementary row operations for the purpose of solving linear systems, and we know from that work that performing an elementary row operation on an augmented matrix does not change the solution set of the corresponding linear system. It follows that applying an elementary row operation to a matrix $A$ does not change the solution set of the corresponding linear system $A\mathbf{x} = \mathbf{0}$, or, stated another way, it does not change the null space of $A$. Thus we have the following theorem.


THEOREM 4.7.3 


Elementary row operations do not change the null space of a matrix. 


The following theorem, whose proof is left as an exercise, is a companion to Theorem 4.7.3.


THEOREM 4.7.4 


Elementary row operations do not change the row space of a matrix. 


Theorems 4.7.3 and 4.7.4 might tempt you into incorrectly believing that elementary row operations do not change the 
column space of a matrix. To see why this 1s not true, compare the matrices 


13 1 3 
A= and B— 
The matrix В can be obtained from А by adding —2 times the first row to the second. However, this operation has 
changed the column space of А, since that column space consists of all scalar multiples of 


H 


whereas the column space of B consists of all scalar multiples of 


H 


EXAMPLE 4 Finding a Basis for the Null Space of a Matrix — 


and the two are different spaces. 


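Find a basis for the null space of the matrix


\[
A = \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 \\
0 & 0 & 5 & 10 & 0 & 15 \\
2 & 6 & 0 & 8 & 4 & 18
\end{bmatrix}
\]


Solution The null space of A is the solution space of the homogeneous linear system Ax = 0, which, as
shown in Example 3, has the basis


\[
\mathbf{v}_1 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]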


Remark Observe that the basis vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ in the last example are the vectors that result by successively
setting one of the parameters in the general solution equal to 1 and the others equal to 0.
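
The observation in the Remark can be turned into a small computation. The following Python sketch is an illustration only and is not part of the text: the helper names `rref` and `null_space_basis` are assumptions made for this example, and exact rational arithmetic from the standard `fractions` module is used so that no rounding occurs. It sets each free variable to 1 in turn (the others to 0) and reads off the leading variables from the reduced row echelon form.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row echelon form and its pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]          # create the leading 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:             # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector per free variable: that variable is 1, the other free ones 0."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]                         # solve row i for its leading variable
        basis.append(v)
    return basis

# The matrix of Example 4
A = [[1, 3, -2, 0, 2, 0],
     [2, 6, -5, -2, 4, -3],
     [0, 0, 5, 10, 0, 15],
     [2, 6, 0, 8, 4, 18]]
basis = null_space_basis(A)
for v in basis:
    print([str(x) for x in v])
```

Each vector in `basis` can be checked directly by confirming that every row of A has dot product 0 with it.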


The following theorem makes it possible to find bases for the row and column spaces of a matrix in row echelon form
by inspection.


THEOREM 4.7.5 


If a matrix R is in row echelon form, then the row vectors with the leading 1's (the nonzero row vectors) form a
basis for the row space of R, and the column vectors with the leading 1's of the row vectors form a basis for the
column space of R.


The proof involves little more than an analysis of the positions of the 0's and 1's of R. We omit the details.


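EXAMPLE 5 Bases for Row and Column Spaces


The matrix

\[
R = \begin{bmatrix}
1 & -2 & 5 & 0 & 3 \\
0 & 1 & 3 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


is in row echelon form. From Theorem 4.7.5, the vectors

\[
\begin{aligned}
\mathbf{r}_1 &= [\,1 \ \ {-2} \ \ 5 \ \ 0 \ \ 3\,] \\
\mathbf{r}_2 &= [\,0 \ \ 1 \ \ 3 \ \ 0 \ \ 0\,] \\
\mathbf{r}_3 &= [\,0 \ \ 0 \ \ 0 \ \ 1 \ \ 0\,]
\end{aligned}
\]


form a basis for the row space of R, and the vectors


\[
\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{c}_2 = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{c}_4 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]


form a basis for the column space of R.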


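EXAMPLE 6 Basis for a Row Space by Row Reduction


Find a basis for the row space of the matrix


\[
A = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
2 & -6 & 9 & -1 & 8 & 2 \\
2 & -6 & 9 & -1 & 9 & 7 \\
-1 & 3 & -4 & 2 & -5 & -4
\end{bmatrix}
\]


Solution Since elementary row operations do not change the row space of a matrix, we can find a basis
for the row space of A by finding a basis for the row space of any row echelon form of A. Reducing A to
row echelon form, we obtain (verify)


\[
R = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
0 & 0 & 1 & 3 & -2 & -6 \\
0 & 0 & 0 & 0 & 1 & 5 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


By Theorem 4.7.5, the nonzero row vectors of R form a basis for the row space of R and hence form a
basis for the row space of A. These basis vectors are


\[
\begin{aligned}
\mathbf{r}_1 &= [\,1 \ \ {-3} \ \ 4 \ \ {-2} \ \ 5 \ \ 4\,] \\
\mathbf{r}_2 &= [\,0 \ \ 0 \ \ 1 \ \ 3 \ \ {-2} \ \ {-6}\,] \\
\mathbf{r}_3 &= [\,0 \ \ 0 \ \ 0 \ \ 0 \ \ 1 \ \ 5\,]
\end{aligned}
\]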
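
The "(verify)" step in Example 6 can be carried out mechanically. The sketch below is an illustration under stated assumptions, not the text's own procedure: the function name `row_echelon` is hypothetical, the pivot in each column is simply the first available nonzero entry, and only forward elimination is performed (so the result is a row echelon form, not the reduced form). With that pivot rule the output happens to coincide with the matrix R shown above; a different pivot rule could yield a different, equally valid, row echelon form.

```python
from fractions import Fraction

def row_echelon(rows):
    """Forward elimination with leading 1's; returns a row echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(r + 1, len(m)):    # clear entries below the leading 1
            f = m[i][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m

# The matrix of Example 6
A = [[1, -3, 4, -2, 5, 4],
     [2, -6, 9, -1, 8, 2],
     [2, -6, 9, -1, 9, 7],
     [-1, 3, -4, 2, -5, -4]]
R = row_echelon(A)
row_basis = [row for row in R if any(row)]   # keep the nonzero rows
for row in row_basis:
    print([str(x) for x in row])
```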


The problem of finding a basis for the column space of the matrix A in Example 6 is complicated by the fact that an
elementary row operation can alter its column space. However, the good news is that elementary row operations do not
alter dependence relationships among the column vectors. To make this more precise, suppose that $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k$ are
linearly dependent column vectors of A, so there are scalars $c_1, c_2, \ldots, c_k$ that are not all zero and such that


\[ c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_k\mathbf{w}_k = \mathbf{0} \tag{4} \]


If we perform an elementary row operation on A, then these vectors will be changed into new column vectors
$\mathbf{w}_1', \mathbf{w}_2', \ldots, \mathbf{w}_k'$. At first glance it would seem possible that the transformed vectors might be linearly independent.
However, this is not so, since it can be proved that these new column vectors will be linearly dependent and, in fact,
related by an equation


\[ c_1\mathbf{w}_1' + c_2\mathbf{w}_2' + \cdots + c_k\mathbf{w}_k' = \mathbf{0} \]


that has exactly the same coefficients as (4). It follows from the fact that elementary row operations are reversible that
they also preserve linear independence among column vectors (why?). The following theorem summarizes all of these
results.


THEOREM 4.7.6 


If A and B are row equivalent matrices, then: 


(a) A given set of column vectors of A is linearly independent if and only if the corresponding column vectors 
of B are linearly independent. 


(b) A given set of column vectors of А forms a basis for the column space of A if and only if the corresponding 
column vectors of B form a basis for the column space of В. 
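
One way to see why dependence relations survive row operations is sketched below. It assumes the fact from the earlier material on elementary matrices that performing an elementary row operation on A produces EA for some elementary matrix E; the symbols E and $\mathbf{w}_j$ are used here only for illustration. The columns of EA are $E\mathbf{w}_1, \ldots, E\mathbf{w}_k$, so any relation among the original columns carries over with the same coefficients:

```latex
\[
c_1 (E\mathbf{w}_1) + c_2 (E\mathbf{w}_2) + \cdots + c_k (E\mathbf{w}_k)
  = E\,(c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_k\mathbf{w}_k)
  = E\,\mathbf{0} = \mathbf{0}
\]
```

Because E is invertible, multiplying a relation among the vectors $E\mathbf{w}_j$ by $E^{-1}$ recovers the same relation among the $\mathbf{w}_j$, which is the reversibility argument behind part (a).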


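EXAMPLE 7 Basis for a Column Space by Row Reduction


Find a basis for the column space of the matrix


\[
A = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
2 & -6 & 9 & -1 & 8 & 2 \\
2 & -6 & 9 & -1 & 9 & 7 \\
-1 & 3 & -4 & 2 & -5 & -4
\end{bmatrix}
\]


Solution We observed in Example 6 that the matrix

\[
R = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
0 & 0 & 1 & 3 & -2 & -6 \\
0 & 0 & 0 & 0 & 1 & 5 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


is a row echelon form of A. Keeping in mind that A and R can have different column spaces, we cannot
find a basis for the column space of A directly from the column vectors of R. However, it follows from
Theorem 4.7.6b that if we can find a set of column vectors of R that forms a basis for the column space of
R, then the corresponding column vectors of A will form a basis for the column space of A.


Since the first, third, and fifth columns of R contain the leading 1's of the row vectors, the vectors


\[
\mathbf{c}_1' = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{c}_3' = \begin{bmatrix} 4 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{c}_5' = \begin{bmatrix} 5 \\ -2 \\ 1 \\ 0 \end{bmatrix}
\]

form a basis for the column space of R. Thus, the corresponding column vectors of A, which are

\[
\mathbf{c}_1 = \begin{bmatrix} 1 \\ 2 \\ 2 \\ -1 \end{bmatrix}, \quad
\mathbf{c}_3 = \begin{bmatrix} 4 \\ 9 \\ 9 \\ -4 \end{bmatrix}, \quad
\mathbf{c}_5 = \begin{bmatrix} 5 \\ 8 \\ 9 \\ -5 \end{bmatrix}
\]


form a basis for the column space of A.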
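
The method of Example 7 amounts to locating the pivot columns during row reduction and then returning to the original matrix. A minimal sketch follows; the name `pivot_columns` is an assumption made for this illustration, and indices are 0-based, so the text's first, third, and fifth columns appear as 0, 2, and 4.

```python
from fractions import Fraction

def pivot_columns(rows):
    """0-based indices of the columns that acquire a leading entry during elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

# The matrix of Examples 6 and 7
A = [[1, -3, 4, -2, 5, 4],
     [2, -6, 9, -1, 8, 2],
     [2, -6, 9, -1, 9, 7],
     [-1, 3, -4, 2, -5, -4]]
pivots = pivot_columns(A)
# Take the corresponding columns of A itself, not of the echelon form
column_basis = [[row[c] for row in A] for c in pivots]
print(pivots)
print(column_basis)
```

The key design point mirrors the warning in the solution: the pivot positions come from the reduced matrix, but the basis vectors are read from A.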


Up to now we have focused on methods for finding bases associated with matrices. Those methods can readily be
adapted to the more general problem of finding a basis for the space spanned by a set of vectors in $R^n$.


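EXAMPLE 8 Basis for a Vector Space Using Row Operations


Find a basis for the subspace of $R^5$ spanned by the vectors

\[
\mathbf{v}_1 = (1, -2, 0, 0, 3), \quad \mathbf{v}_2 = (2, -5, -3, -2, 6), \quad
\mathbf{v}_3 = (0, 5, 15, 10, 0), \quad \mathbf{v}_4 = (2, 6, 18, 8, 6)
\]


Solution The space spanned by these vectors is the row space of the matrix


\[
\begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
2 & -5 & -3 & -2 & 6 \\
0 & 5 & 15 & 10 & 0 \\
2 & 6 & 18 & 8 & 6
\end{bmatrix}
\]


Reducing this matrix to row echelon form, we obtain


\[
\begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
0 & 1 & 3 & 2 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


The nonzero row vectors in this matrix are

\[
\mathbf{w}_1 = (1, -2, 0, 0, 3), \quad \mathbf{w}_2 = (0, 1, 3, 2, 0), \quad \mathbf{w}_3 = (0, 0, 1, 1, 0)
\]

These vectors form a basis for the row space and consequently form a basis for the subspace of $R^5$
spanned by $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$, and $\mathbf{v}_4$.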


Bases Formed from Row and Column Vectors of a Matrix 


In all of the examples we have considered thus far we have looked for bases in which no restrictions were imposed on 
the individual vectors in the basis. We now want to focus on the problem of finding a basis for the row space of a matrix 
А consisting entirely of row vectors from А and a basis for the column space of А consisting entirely of column vectors 
of A. 


Looking back on our earlier work, we see that the procedure followed in Example 7 did, in fact, produce a basis for the 
column space of А consisting of column vectors of А, whereas the procedure used in Example 6 produced a basis for the 
row space of А, but that basis did not consist of row vectors of А. The following example shows how to adapt the 
procedure from Example 7 to find a basis for the row space of a matrix that is formed from its row vectors. 


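EXAMPLE 9 Basis for the Row Space of a Matrix


Find a basis for the row space of

\[
A = \begin{bmatrix}
1 & -2 & 0 & 0 & 3 \\
2 & -5 & -3 & -2 & 6 \\
0 & 5 & 15 & 10 & 0 \\
2 & 6 & 18 & 8 & 6
\end{bmatrix}
\]

consisting entirely of row vectors from A.


Solution We will transpose A, thereby converting the row space of A into the column space of $A^T$; then
we will use the method of Example 7 to find a basis for the column space of $A^T$; and then we will
transpose again to convert column vectors back to row vectors. Transposing A yields


\[
A^T = \begin{bmatrix}
1 & 2 & 0 & 2 \\
-2 & -5 & 5 & 6 \\
0 & -3 & 15 & 18 \\
0 & -2 & 10 & 8 \\
3 & 6 & 0 & 6
\end{bmatrix}
\]

Reducing this matrix to row echelon form yields

\[
\begin{bmatrix}
1 & 2 & 0 & 2 \\
0 & 1 & -5 & -10 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]


The first, second, and fourth columns contain the leading 1's, so the corresponding column vectors in $A^T$
form a basis for the column space of $A^T$; these are


\[
\mathbf{c}_1 = \begin{bmatrix} 1 \\ -2 \\ 0 \\ 0 \\ 3 \end{bmatrix}, \quad
\mathbf{c}_2 = \begin{bmatrix} 2 \\ -5 \\ -3 \\ -2 \\ 6 \end{bmatrix}, \quad\text{and}\quad
\mathbf{c}_4 = \begin{bmatrix} 2 \\ 6 \\ 18 \\ 8 \\ 6 \end{bmatrix}
\]


Transposing again and adjusting the notation appropriately yields the basis vectors

\[
\mathbf{r}_1 = [\,1 \ \ {-2} \ \ 0 \ \ 0 \ \ 3\,], \quad
\mathbf{r}_2 = [\,2 \ \ {-5} \ \ {-3} \ \ {-2} \ \ 6\,], \quad\text{and}\quad
\mathbf{r}_4 = [\,2 \ \ 6 \ \ 18 \ \ 8 \ \ 6\,]
\]

for the row space of A.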


Next, we will give an example that adapts the methods we have developed above to solve the following general
problem in $R^n$:


PROBLEM


Given a set of vectors $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ in $R^n$, find a subset of these vectors that forms a basis for span(S),
and express those vectors that are not in that basis as a linear combination of the basis vectors.


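EXAMPLE 10 Basis and Linear Combinations


(a) Find a subset of the vectors

\[
\mathbf{v}_1 = (1, -2, 0, 3), \quad \mathbf{v}_2 = (2, -5, -3, 6), \quad
\mathbf{v}_3 = (0, 1, 3, 0), \quad \mathbf{v}_4 = (2, -1, 4, -7), \quad \mathbf{v}_5 = (5, -8, 1, 2)
\]

that forms a basis for the space spanned by these vectors.


(b) Express each vector not in the basis as a linear combination of the basis vectors.


Solution


(a) We begin by constructing a matrix that has $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_5$ as its column vectors (in that order, left to right):


\[
\begin{bmatrix}
1 & 2 & 0 & 2 & 5 \\
-2 & -5 & 1 & -1 & -8 \\
0 & -3 & 3 & 4 & 1 \\
3 & 6 & 0 & -7 & 2
\end{bmatrix} \tag{5}
\]

The first part of our problem can be solved by finding a basis for the column space of this matrix.


Reducing the matrix to reduced row echelon form and denoting the column vectors of the resulting
matrix by $\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{w}_4$, and $\mathbf{w}_5$ (again left to right) yields


\[
\begin{bmatrix}
1 & 0 & 2 & 0 & 1 \\
0 & 1 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{6}
\]

The leading 1's occur in columns 1, 2, and 4, so by Theorem 4.7.5,

\[ \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_4\} \]

is a basis for the column space of (6), and consequently,

\[ \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_4\} \]

is a basis for the column space of (5).


(b) We will start by expressing $\mathbf{w}_3$ and $\mathbf{w}_5$ as linear combinations of the basis vectors $\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_4$. The
simplest way of doing this is to express $\mathbf{w}_3$ and $\mathbf{w}_5$ in terms of basis vectors with smaller subscripts.
Accordingly, we will express $\mathbf{w}_3$ as a linear combination of $\mathbf{w}_1$ and $\mathbf{w}_2$, and we will express $\mathbf{w}_5$ as a
linear combination of $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_4$. By inspection of (6), these linear combinations are


\[
\begin{aligned}
\mathbf{w}_3 &= 2\mathbf{w}_1 - \mathbf{w}_2 \\
\mathbf{w}_5 &= \mathbf{w}_1 + \mathbf{w}_2 + \mathbf{w}_4
\end{aligned}
\]

We call these the dependency equations. The corresponding relationships in (5) are

\[
\begin{aligned}
\mathbf{v}_3 &= 2\mathbf{v}_1 - \mathbf{v}_2 \\
\mathbf{v}_5 &= \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_4
\end{aligned}
\]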


The following is a summary of the steps that we followed in our last example to solve the problem posed above.

Basis for Span(S)

Step 1. Form the matrix A having the vectors in $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ as column vectors.

Step 2. Reduce the matrix A to reduced row echelon form R.

Step 3. Denote the column vectors of R by $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k$.


Step 4. Identify the columns of R that contain the leading 1's. The corresponding column vectors of A form a basis for
span(S).
This completes the first part of the problem.


Step 5. Obtain a set of dependency equations by expressing each column vector of R that does not contain a leading 1
as a linear combination of preceding column vectors that do contain leading 1's.


Step 6. Replace the column vectors of R that appear in the dependency equations by the corresponding column vectors
of A.
This completes the second part of the problem.
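
Steps 1 through 6 can be sketched as a short program. This is an illustrative sketch, not the text's own implementation: the names `rref` and `dependencies` are assumptions, indices are 0-based (so $\mathbf{v}_1$ is index 0), and exact fractions are used. For a non-pivot column j of R, the entry in pivot row i is the coefficient of the i-th pivot column in the dependency equation, which is what Step 5 reads off by inspection.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row echelon form and its pivot column indices."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

# Step 1: the vectors of Example 10 become the columns of A
vectors = [(1, -2, 0, 3), (2, -5, -3, 6), (0, 1, 3, 0), (2, -1, 4, -7), (5, -8, 1, 2)]
A = [list(row) for row in zip(*vectors)]

# Steps 2-4: reduce and locate the pivot columns; those vectors form the basis
R, pivots = rref(A)
basis = [vectors[j] for j in pivots]

# Steps 5-6: each non-pivot column j is a combination of the pivot columns,
# with coefficients taken from column j of R; the same coefficients apply to A
dependencies = {
    j: {p: R[i][j] for i, p in enumerate(pivots) if R[i][j] != 0}
    for j in range(len(vectors)) if j not in pivots
}
for j, combo in dependencies.items():
    terms = " + ".join(f"({c})*v{p + 1}" for p, c in combo.items())
    print(f"v{j + 1} = {terms}")
```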


Concept Review 


Row vectors 


Column vectors 


Row space 


Column space 


Null space 


General solution 


Particular solution 


Relationships among linear systems and row spaces, column spaces, and null spaces 


Relationships among the row space, column space, and null space of a matrix 


Dependency equations 


Skills 


* Determine whether a given vector is in the column space of a matrix; if it is, express it as a linear 
combination of the column vectors of the matrix. 


* Find a basis for the null space of a matrix. 
* Find a basis for the row space of a matrix. 
* Find a basis for the column space of a matrix. 


* Find a basis for the span of a set of vectors in $R^n$.


Exercise Set 4.7 


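1. List the row vectors and column vectors of the matrix


\[
\begin{bmatrix}
2 & -1 & 0 & 1 \\
3 & 5 & 7 & -1 \\
1 & 4 & 2 & 7
\end{bmatrix}
\]


Answer:


$\mathbf{r}_1 = (2, -1, 0, 1)$, $\mathbf{r}_2 = (3, 5, 7, -1)$, $\mathbf{r}_3 = (1, 4, 2, 7)$;


\[
\mathbf{c}_1 = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}, \quad
\mathbf{c}_2 = \begin{bmatrix} -1 \\ 5 \\ 4 \end{bmatrix}, \quad
\mathbf{c}_3 = \begin{bmatrix} 0 \\ 7 \\ 2 \end{bmatrix}, \quad
\mathbf{c}_4 = \begin{bmatrix} 1 \\ -1 \\ 7 \end{bmatrix}
\]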


2. Express the product Ax as a linear combination of the column vectors of A.


3. Determine whether b is in the column space of A, and if so, express b as a linear combination of the column vectors
of A.


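(b) $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}$; $\mathbf{b} = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & -1 & 1 \\ 9 & 3 & 1 \\ 1 & 1 & 1 \end{bmatrix}$; $\mathbf{b} = \begin{bmatrix} 5 \\ 1 \\ -1 \end{bmatrix}$

(d) $A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix}$; $\mathbf{b} = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}$

(e) $A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 1 & 2 & 1 \\ 1 & 2 & 1 & 3 \\ 0 & 1 & 2 & 2 \end{bmatrix}$; $\mathbf{b} = \begin{bmatrix} 4 \\ 3 \\ 5 \\ 7 \end{bmatrix}$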

Answer: 


ИСНЕ 


(b) b is not in the column space of A. 


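(c) $1\begin{bmatrix} 1 \\ 9 \\ 1 \end{bmatrix} - 3\begin{bmatrix} -1 \\ 3 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ -1 \end{bmatrix}$

(d) $\begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} + (t - 1)\begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} + t\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$

(e) $\begin{bmatrix} 4 \\ 3 \\ 5 \\ 7 \end{bmatrix} = -26\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + 13\begin{bmatrix} 2 \\ 1 \\ 2 \\ 1 \end{bmatrix} - 7\begin{bmatrix} 0 \\ 2 \\ 1 \\ 2 \end{bmatrix} + 4\begin{bmatrix} 1 \\ 1 \\ 3 \\ 2 \end{bmatrix}$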


4. Suppose that $x_1 = -1$, $x_2 = 2$, $x_3 = 4$, $x_4 = -3$ is a solution of a nonhomogeneous linear system Ax = b and that
the solution set of the homogeneous system Ax = 0 is given by the formulas

\[ x_1 = -3r + 4s, \quad x_2 = r - s, \quad x_3 = r, \quad x_4 = s \]


(a) Find a vector form of the general solution of Ax = 0.


(b) Find a vector form of the general solution of Ax = b.


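5. In parts (a)-(d), find the vector form of the general solution of the given linear system Ax = b; then use that result to
find the vector form of the general solution of Ax = 0.

(a)
\[
\begin{aligned}
x_1 - 3x_2 &= 1 \\
2x_1 - 6x_2 &= 2
\end{aligned}
\]

(b)
\[
\begin{aligned}
x_1 + x_2 + 2x_3 &= 5 \\
x_1 + x_3 &= -2 \\
2x_1 + x_2 + 3x_3 &= 3
\end{aligned}
\]

(c)
\[
\begin{aligned}
x_1 - 2x_2 + x_3 + 2x_4 &= -1 \\
2x_1 - 4x_2 + 2x_3 + 4x_4 &= -2 \\
-x_1 + 2x_2 - x_3 - 2x_4 &= 1 \\
3x_1 - 6x_2 + 3x_3 + 6x_4 &= -3
\end{aligned}
\]

(d)
\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 + x_4 &= 4 \\
-2x_1 + x_2 + 2x_3 + x_4 &= -1 \\
-x_1 + 3x_2 - x_3 + 2x_4 &= 3 \\
4x_1 - 7x_2 - 5x_4 &= -5
\end{aligned}
\]

Answer:

(a) $\begin{bmatrix} 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} 3 \\ 1 \end{bmatrix}$; $\ t\begin{bmatrix} 3 \\ 1 \end{bmatrix}$

(b) $\begin{bmatrix} -2 \\ 7 \\ 0 \end{bmatrix} + t\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$; $\ t\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$

(c) $\begin{bmatrix} -1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + r\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} -2 \\ 0 \\ 0 \\ 1 \end{bmatrix}$; $\ r\begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} -2 \\ 0 \\ 0 \\ 1 \end{bmatrix}$

(d) $\begin{bmatrix} \tfrac{6}{5} \\ \tfrac{7}{5} \\ 0 \\ 0 \end{bmatrix} + s\begin{bmatrix} \tfrac{7}{5} \\ \tfrac{4}{5} \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} \tfrac{1}{5} \\ -\tfrac{3}{5} \\ 0 \\ 1 \end{bmatrix}$; $\ s\begin{bmatrix} \tfrac{7}{5} \\ \tfrac{4}{5} \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} \tfrac{1}{5} \\ -\tfrac{3}{5} \\ 0 \\ 1 \end{bmatrix}$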


6. Find a basis for the null space of A. 
(a) 1-1 3 
А=|5 —4 -4 
7 —6 2 
(b) 20 -1 
4= |4 
0 


(с) 1452 
А=| 2130 
uli 5 
(d) 1 4 5 6 9 
8707 do Аш] 
mio duel eue] 
2 3 5 7 8 
(e) 1 -3 2 2 1 
Qv 3 6 0—3 
А=| 2 -3 —2 4 4 
3-6 0 6 5 
up" vB 15 add anb 


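7. In each part, a matrix in row echelon form is given. By inspection, find bases for the row and column spaces of A.


(a) $\begin{bmatrix} 1 & 0 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & -3 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 4 & 5 \\ 0 & 1 & -3 & 0 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(d) $\begin{bmatrix} 1 & 2 & -1 & 5 \\ 0 & 1 & 4 & 3 \\ 0 & 0 & 1 & -7 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Answer:

(a) $\mathbf{r}_1 = [1 \ \ 0 \ \ 2]$, $\mathbf{r}_2 = [0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$

(b) $\mathbf{r}_1 = [1 \ \ {-3} \ \ 0 \ \ 0]$, $\mathbf{r}_2 = [0 \ \ 1 \ \ 0 \ \ 0]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_2 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}$

(c) $\mathbf{r}_1 = [1 \ \ 2 \ \ 4 \ \ 5]$, $\mathbf{r}_2 = [0 \ \ 1 \ \ {-3} \ \ 0]$, $\mathbf{r}_3 = [0 \ \ 0 \ \ 1 \ \ {-3}]$, $\mathbf{r}_4 = [0 \ \ 0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_3 = \begin{bmatrix} 4 \\ -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_4 = \begin{bmatrix} 5 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}$

(d) $\mathbf{r}_1 = [1 \ \ 2 \ \ {-1} \ \ 5]$, $\mathbf{r}_2 = [0 \ \ 1 \ \ 4 \ \ 3]$, $\mathbf{r}_3 = [0 \ \ 0 \ \ 1 \ \ {-7}]$, $\mathbf{r}_4 = [0 \ \ 0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{c}_3 = \begin{bmatrix} -1 \\ 4 \\ 1 \\ 0 \end{bmatrix}$, $\mathbf{c}_4 = \begin{bmatrix} 5 \\ 3 \\ -7 \\ 1 \end{bmatrix}$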

8. For the matrices in Exercise 6, find a basis for the row space of А by reducing the matrix to row echelon form. 


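9. By inspection, find a basis for the row space and a basis for the column space of each matrix.


(a) $\begin{bmatrix} 1 & 0 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & -3 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 4 & 5 \\ 0 & 1 & -3 & 0 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(d) $\begin{bmatrix} 1 & 2 & -1 & 5 \\ 0 & 1 & 4 & 3 \\ 0 & 0 & 1 & -7 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Answer:

(a) $\mathbf{r}_1 = [1 \ \ 0 \ \ 2]$; $\mathbf{r}_2 = [0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$

(b) $\mathbf{r}_1 = [1 \ \ {-3} \ \ 0 \ \ 0]$; $\mathbf{r}_2 = [0 \ \ 1 \ \ 0 \ \ 0]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_2 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}$

(c) $\mathbf{r}_1 = [1 \ \ 2 \ \ 4 \ \ 5]$; $\mathbf{r}_2 = [0 \ \ 1 \ \ {-3} \ \ 0]$; $\mathbf{r}_3 = [0 \ \ 0 \ \ 1 \ \ {-3}]$; $\mathbf{r}_4 = [0 \ \ 0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_3 = \begin{bmatrix} 4 \\ -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_4 = \begin{bmatrix} 5 \\ 0 \\ -3 \\ 1 \\ 0 \end{bmatrix}$

(d) $\mathbf{r}_1 = [1 \ \ 2 \ \ {-1} \ \ 5]$; $\mathbf{r}_2 = [0 \ \ 1 \ \ 4 \ \ 3]$; $\mathbf{r}_3 = [0 \ \ 0 \ \ 1 \ \ {-7}]$; $\mathbf{r}_4 = [0 \ \ 0 \ \ 0 \ \ 1]$; $\mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_2 = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 0 \end{bmatrix}$; $\mathbf{c}_3 = \begin{bmatrix} -1 \\ 4 \\ 1 \\ 0 \end{bmatrix}$; $\mathbf{c}_4 = \begin{bmatrix} 5 \\ 3 \\ -7 \\ 1 \end{bmatrix}$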

10. For the matrices in Exercise 6, find a basis for the row space of А consisting entirely of row vectors of A. 
11. Find a basis for the subspace of $R^4$ spanned by the given vectors.
(a) $(1, 1, -4, -3)$, $(2, 0, 2, -2)$, $(2, -1, 3, 2)$


(b) $(-1, 1, -2, 0)$, $(3, 3, 6, 0)$, $(9, 0, 0, 3)$
(c) $(1, 1, 0, 0)$, $(0, 0, 1, 1)$, $(-2, 0, 2, 2)$, $(0, -3, 0, 3)$


Answer:
(a) $(1, 1, -4, -3)$, $(0, 1, -5, -2)$, $(0, 0, 1, -\tfrac{1}{2})$
(b) $(1, -1, 2, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, -\tfrac{1}{6})$


(c) $(1, 1, 0, 0)$, $(0, 1, 1, 1)$, $(0, 0, 1, 1)$, $(0, 0, 0, 1)$


12. Find a subset of the vectors that forms a basis for the space spanned by the vectors; then express each vector that is
not in the basis as a linear combination of the basis vectors.
(a) $\mathbf{v}_1 = (1, 0, 1, 1)$, $\mathbf{v}_2 = (-3, 3, 7, 1)$, $\mathbf{v}_3 = (-1, 3, 9, 3)$, $\mathbf{v}_4 = (-5, 3, 5, -1)$
(b) $\mathbf{v}_1 = (1, -2, 0, 3)$, $\mathbf{v}_2 = (2, -4, 0, 6)$, $\mathbf{v}_3 = (-1, 1, 2, 0)$, $\mathbf{v}_4 = (0, -1, 2, 3)$
(c) $\mathbf{v}_1 = (1, -1, 5, 2)$, $\mathbf{v}_2 = (-2, 3, 1, 0)$, $\mathbf{v}_3 = (4, -5, 9, 4)$, $\mathbf{v}_4 = (0, 4, 2, -3)$, $\mathbf{v}_5 = (-7, 18, 2, -8)$


13. Prove that the row vectors of an $n \times n$ invertible matrix A form a basis for $R^n$.


14. Construct a matrix whose null space consists of all linear combinations of the vectors 


1 2 
v= Е and ъз = р 
2 4 
15. (а) Let 
010 
A=/1 0 0 
000 


Show that relative to an xyz-coordinate system in 3-space the null space of A consists of all points on the z-axis
and that the column space consists of all points in the xy-plane (see the accompanying figure).


(b) Find a 3 x 3 matrix whose null space is the x-axis and whose column space is the yz-plane. 


[Figure Ex-15: the null space of A and the column space of A]


16. Find a 3 x 3 matrix whose null space is 
(a) a point. 
(b) a line. 
(c) a plane. 
17. (a) Find all 2 x 2 matrices whose null space is the line 3x — 5y = 0. 
(b) Sketch the null spaces of the following matrices: 


^l s) Lo 5} 
„БП! 


Answer: 


(a) $\begin{bmatrix} 3a & -5a \\ 3b & -5b \end{bmatrix}$ for all real numbers a, b not both 0.


(b) Since А and В are invertible, their null spaces are the origin. The null space of C is the line 3x + y = 0. The null 
space of D is the entire xy-plane. 


18. The equation $x_1 + x_2 + x_3 = 1$ can be viewed as a linear system of one equation in three unknowns. Express its
general solution as a particular solution plus the general solution of the corresponding homogeneous system. 
[Suggestion: Write the vectors in column form.] 


19. Suppose that A and B are $n \times n$ matrices and A is invertible. Invent and prove a theorem that describes how the row
spaces of AB and B are related.


True-False Exercises 

In parts (a)-(j) determine whether the statement is true or false, and justify your answer.

(a) The span of $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is the column space of the matrix whose column vectors are $\mathbf{v}_1, \ldots, \mathbf{v}_n$.
Answer: 


True 


(b) The column space of a matrix A is the set of solutions of Ax = b.
Answer: 


False 


(c) If R is the reduced row echelon form of A, then those column vectors of R that contain the leading 1's form a basis for 
the column space of A. 


Answer: 


False 


(d) The set of nonzero row vectors of a matrix A is a basis for the row space of A.
Answer: 


False 


(e) If A and B are $n \times n$ matrices that have the same row space, then A and B have the same column space.
Answer: 


False 


(f) If E is an $m \times m$ elementary matrix and A is an $m \times n$ matrix, then the null space of EA is the same as the null space
of A.


Answer: 


True 


(g) If E is an $m \times m$ elementary matrix and A is an $m \times n$ matrix, then the row space of EA is the same as the row space
of A.


Answer: 


True 


(h) If E is an $m \times m$ elementary matrix and A is an $m \times n$ matrix, then the column space of EA is the same as the column
space of A.


Answer: 


False 


(i) The system Ax = b is inconsistent if and only if b is not in the column space of A.
Answer: 


True 


(j) There is an invertible matrix A and a singular matrix B such that the row spaces of A and B are the same.
Answer: 


False 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


4.8 Rank, Nullity, and the Fundamental Matrix Spaces


In the last section we investigated relationships between a system of linear equations and the row space, column 
space, and null space of its coefficient matrix. In this section we will be concerned with the dimensions of those 
spaces. The results we obtain will provide a deeper insight into the relationship between a linear system and its
coefficient matrix. 


Row and Column Spaces Have Equal Dimensions 


In Examples 6 and 7 of Section 4.7 we found that the row and column spaces of the matrix


\[
A = \begin{bmatrix}
1 & -3 & 4 & -2 & 5 & 4 \\
2 & -6 & 9 & -1 & 8 & 2 \\
2 & -6 & 9 & -1 & 9 & 7 \\
-1 & 3 & -4 & 2 & -5 & -4
\end{bmatrix}
\]


both have three basis vectors and hence are both three-dimensional. The fact that these spaces have the same 
dimension is not accidental, but rather a consequence of the following theorem. 


THEOREM 4.8.1 


The row space and column space of a matrix A have the same dimension. 


Proof Let R be any row echelon form of A. It follows from Theorem 4.7.4 and Theorem 4.7.6b that


dim(row space of A) = dim(row space of R)
dim(column space of A) = dim(column space of R)


so it suffices to show that the row and column spaces of R have the same dimension. But the dimension of the row
space of R is the number of nonzero rows, and by Theorem 4.7.5 the dimension of the column space of R is the
number of leading 1's. Since these two numbers are the same, the row and column spaces have the same dimension.


Rank and Nullity 


The dimensions of the row space, column space, and null space of a matrix are such important numbers that there is 
some notation and terminology associated with them. 


DEFINITION 1 


The common dimension of the row space and column space of a matrix A is called the rank of A and is 
denoted by rank(A); the dimension of the null space of A is called the nullity of A and is denoted by 
nullity(A). 


The proof of Theorem 4.8.1 shows that the rank
of A can be interpreted as the number of leading
1's in any row echelon form of A.


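EXAMPLE 1 Rank and Nullity of a 4 x 6 Matrix


Find the rank and nullity of the matrix


\[
A = \begin{bmatrix}
-1 & 2 & 0 & 4 & 5 & -3 \\
3 & -7 & 2 & 0 & 1 & 4 \\
2 & -5 & 2 & 4 & 6 & 1 \\
4 & -9 & 2 & -4 & -4 & 7
\end{bmatrix}
\]


Solution The reduced row echelon form of A is

\[
\begin{bmatrix}
1 & 0 & -4 & -28 & -37 & 13 \\
0 & 1 & -2 & -12 & -16 & 5 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{1}
\]


(verify). Since this matrix has two leading 1's, its row and column spaces are two-dimensional and
rank(A) = 2. To find the nullity of A, we must find the dimension of the solution space of the linear
system Ax = 0. This system can be solved by reducing its augmented matrix to reduced row echelon
form. The resulting matrix will be identical to (1), except that it will have an additional last column of
zeros, and hence the corresponding system of equations will be

\[
\begin{aligned}
x_1 - 4x_3 - 28x_4 - 37x_5 + 13x_6 &= 0 \\
x_2 - 2x_3 - 12x_4 - 16x_5 + 5x_6 &= 0
\end{aligned}
\]


Solving these equations for the leading variables yields


\[
\begin{aligned}
x_1 &= 4x_3 + 28x_4 + 37x_5 - 13x_6 \\
x_2 &= 2x_3 + 12x_4 + 16x_5 - 5x_6
\end{aligned} \tag{2}
\]

from which we obtain the general solution

\[
\begin{aligned}
x_1 &= 4r + 28s + 37t - 13u \\
x_2 &= 2r + 12s + 16t - 5u \\
x_3 &= r \\
x_4 &= s \\
x_5 &= t \\
x_6 &= u
\end{aligned}
\]


or in column vector form


\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
= r \begin{bmatrix} 4 \\ 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ s \begin{bmatrix} 28 \\ 12 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 37 \\ 16 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ u \begin{bmatrix} -13 \\ -5 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \tag{3}
\]


Because the four vectors on the right side of (3) form a basis for the solution space, nullity(A) = 4.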
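
The counts in Example 1 can be reproduced by tallying pivot and non-pivot columns. The following sketch is illustrative only; `pivot_columns` is a hypothetical helper written for this example, and it relies on the interpretation stated above that the rank equals the number of leading 1's in a row echelon form, with each remaining column contributing one free variable.

```python
from fractions import Fraction

def pivot_columns(rows):
    """0-based indices of the columns that acquire a leading entry during elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

# The matrix of Example 1
A = [[-1, 2, 0, 4, 5, -3],
     [3, -7, 2, 0, 1, 4],
     [2, -5, 2, 4, 6, 1],
     [4, -9, 2, -4, -4, 7]]
n = len(A[0])
pivots = pivot_columns(A)
free = [c for c in range(n) if c not in pivots]

rank = len(pivots)      # number of leading variables
nullity = len(free)     # number of free variables (parameters)
print(rank, nullity)
```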


EXAMPLE 2 Maximum Value for Rank


What is the maximum possible rank of an $m \times n$ matrix A that is not square?


Solution Since the row vectors of A lie in $R^n$ and the column vectors in $R^m$, the row space of A is
at most n-dimensional and the column space is at most m-dimensional. Since the rank of A is the
common dimension of its row and column space, it follows that the rank is at most the smaller of m
and n. We denote this by writing


\[ \operatorname{rank}(A) \le \min(m, n) \]


in which $\min(m, n)$ is the minimum of m and n.


The following theorem establishes an important relationship between the rank and nullity of a matrix. 


THEOREM 4.8.2 Dimension Theorem for Matrices


If A is a matrix with n columns, then


\[ \operatorname{rank}(A) + \operatorname{nullity}(A) = n \tag{4} \]


Proof Since A has n columns, the homogeneous linear system Ax = 0 has n unknowns (variables). These fall into
two distinct categories: the leading variables and the free variables. Thus,


\[ \begin{bmatrix} \text{number of leading} \\ \text{variables} \end{bmatrix} + \begin{bmatrix} \text{number of free} \\ \text{variables} \end{bmatrix} = n \]


But the number of leading variables is the same as the number of leading 1's in the reduced row echelon form of A,
which is the rank of A; and the number of free variables is the same as the number of parameters in the general
solution of Ax = 0, which is the nullity of A. This yields Formula (4).


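EXAMPLE 3 The Sum of Rank and Nullity


The matrix


\[
A = \begin{bmatrix}
-1 & 2 & 0 & 4 & 5 & -3 \\
3 & -7 & 2 & 0 & 1 & 4 \\
2 & -5 & 2 & 4 & 6 & 1 \\
4 & -9 & 2 & -4 & -4 & 7
\end{bmatrix}
\]


has 6 columns, so

\[ \operatorname{rank}(A) + \operatorname{nullity}(A) = 6 \]

This is consistent with Example 1, where we showed that

\[ \operatorname{rank}(A) = 2 \quad\text{and}\quad \operatorname{nullity}(A) = 4 \]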


The following theorem, which summarizes results already obtained, interprets rank and nullity in the context of a 
homogeneous linear system. 


THEOREM 4.8.3 


If A is an $m \times n$ matrix, then


(a) rank(A) = the number of leading variables in the general solution of Ax = 0.
(b) nullity(A) = the number of parameters in the general solution of Ax = 0.


EXAMPLE 4 Number of Parameters in a General Solution
Find the number of parameters in the general solution of Ax = 0 if A is a 5 x 7 matrix of rank 3.


Solution From (4),

\[ \operatorname{nullity}(A) = n - \operatorname{rank}(A) = 7 - 3 = 4 \]


Thus there are four parameters.


Equivalence Theorem 


In Theorem 2.3.8 we listed seven results that are equivalent to the invertibility of a square matrix А. We are now in 
a position to add eight more results to that list to produce a single theorem that summarizes most of the topics we 
have covered thus far. 


THEOREM 4.8.4 Equivalent Statements 


If A is an $n \times n$ matrix, then the following statements are equivalent.
(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.

(e) Ax = b is consistent for every $n \times 1$ matrix b.

(f) Ax = b has exactly one solution for every $n \times 1$ matrix b.

(g) $\det(A) \neq 0$.

(h) The column vectors of A are linearly independent.

(i) The row vectors of A are linearly independent.

(j) The column vectors of A span $R^n$.

(k) The row vectors of A span $R^n$.

(l) The column vectors of A form a basis for $R^n$.

(m) The row vectors of A form a basis for $R^n$.

(n) A has rank n.

(o) A has nullity 0.


Proof The equivalence of (h) through (m) follows from Theorem 4.5.4 (we omit the details). To complete the
proof we will show that (b), (n), and (o) are equivalent by proving the chain of implications


(b) ⇒ (o) ⇒ (n) ⇒ (b).


(b) ⇒ (o) If Ax = 0 has only the trivial solution, then there are no parameters in that solution, so nullity(A) = 0
by Theorem 4.8.3b.


(o) ⇒ (n) Theorem 4.8.2.


(n) ⇒ (b) If A has rank n, then Theorem 4.8.3a implies that there are n leading variables (hence no free variables)
in the general solution of Ax = 0. This leaves the trivial solution as the only possibility.


Overdetermined and Underdetermined Systems 


In many applications the equations in a linear system correspond to physical constraints or conditions that must be 
satisfied. In general, the most desirable systems are those that have the same number of constraints as unknowns, 
since such systems often have a unique solution. Unfortunately, it is not always possible to match the number of 
constraints and unknowns, so researchers are often faced with linear systems that have more constraints than 
unknowns, called overdetermined systems, or with fewer constraints than unknowns, called underdetermined 
systems. The following two theorems will help us to analyze both overdetermined and underdetermined systems. 


In engineering and other applications, the 
occurrence of an overdetermined or 
underdetermined linear system often signals that 
one or more variables were omitted in formulating 
the problem or that extraneous variables were 
included. This often leads to some kind of 
undesirable physical result. 


THEOREM 4.8.5 


If Ax = b is a consistent linear system of m equations in n unknowns, and if A has rank r, then the general
solution of the system contains $n - r$ parameters.


Proof It follows from Theorem 4.7.2 that the number of parameters is equal to the nullity of A, which, by
Theorem 4.8.2, is $n - r$.


THEOREM 4.8.6 


Let A be an $m \times n$ matrix.

(a) (Overdetermined Case) If $m > n$, then the linear system Ax = b is inconsistent for at least one vector
b in $R^m$.

(b) (Underdetermined Case) If $m < n$, then for each vector b in $R^m$ the linear system Ax = b is either
inconsistent or has infinitely many solutions.


Proof (a) Assume that $m > n$, in which case the column vectors of A cannot span $R^m$ (fewer vectors than the
dimension of $R^m$). Thus, there is at least one vector b in $R^m$ that is not in the column space of A, and for that b the
system Ax = b is inconsistent by Theorem 4.7.1.


Proof (b) Assume that $m < n$. For each vector b in $R^m$ there are two possibilities: either the system Ax = b is
consistent or it is inconsistent. If it is inconsistent, then the proof is complete. If it is consistent, then Theorem 4.8.5
implies that the general solution has $n - r$ parameters, where r = rank(A). But rank(A) is at most the smaller of m and n,
which here is m, so


\[ n - r \ge n - m > 0 \]

This means that the general solution has at least one parameter and hence there are infinitely many solutions.


EXAMPLE 5 Overdetermined and Underdetermined Systems


(a) What can you say about the solutions of an overdetermined system Ax = b of 7 equations in 5
unknowns in which A has rank r = 4?


(b) What can you say about the solutions of an underdetermined system Ax = b of 5 equations in 7
unknowns in which A has rank r = 4?


Solution

(a) The system is consistent for some vector b in $R^7$, and for any such b the number of parameters in
the general solution is $n - r = 5 - 4 = 1$.

(b) The system may be consistent or inconsistent, but if it is consistent for the vector b in $R^5$, then the
general solution has $n - r = 7 - 4 = 3$ parameters.


EXAMPLE 6 An Overdetermined System


The linear system


\[
\begin{aligned}
x_1 - 2x_2 &= b_1 \\
x_1 - x_2 &= b_2 \\
x_1 + x_2 &= b_3 \\
x_1 + 2x_2 &= b_4 \\
x_1 + 3x_2 &= b_5
\end{aligned}
\]


is overdetermined, so it cannot be consistent for all possible values of $b_1, b_2, b_3, b_4$, and $b_5$. Exact
conditions under which the system is consistent can be obtained by solving the linear system by Gauss-Jordan
elimination. We leave it for you to show that the augmented matrix is row equivalent to


\[
\left[\begin{array}{cc|c}
1 & 0 & 2b_2 - b_1 \\
0 & 1 & b_2 - b_1 \\
0 & 0 & b_3 - 3b_2 + 2b_1 \\
0 & 0 & b_4 - 4b_2 + 3b_1 \\
0 & 0 & b_5 - 5b_2 + 4b_1
\end{array}\right] \tag{5}
\]

Thus, the system is consistent if and only if $b_1, b_2, b_3, b_4$, and $b_5$ satisfy the conditions

\[
\begin{aligned}
2b_1 - 3b_2 + b_3 &= 0 \\
3b_1 - 4b_2 + b_4 &= 0 \\
4b_1 - 5b_2 + b_5 &= 0
\end{aligned}
\]


Solving this homogeneous linear system yields

\[ b_1 = 5r - 4s, \quad b_2 = 4r - 3s, \quad b_3 = 2r - s, \quad b_4 = r, \quad b_5 = s \]


where r and s are arbitrary.
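
The consistency conditions of Example 6 can be spot-checked numerically. The sketch below is an illustration, not part of the text: it uses the standard criterion that Ax = b is consistent exactly when appending b to A does not raise the rank (another way of saying that b lies in the column space of A, as in Theorem 4.7.1). The names `rank`, `is_consistent`, and `b_from_parameters` are assumptions made for this example.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_consistent(A, b):
    """Ax = b is consistent iff rank(A) equals the rank of the augmented matrix."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)

def b_from_parameters(r, s):
    """Right-hand sides satisfying the three conditions derived in Example 6."""
    return [5 * r - 4 * s, 4 * r - 3 * s, 2 * r - s, r, s]

# Coefficient matrix of the system in Example 6
A = [[1, -2], [1, -1], [1, 1], [1, 2], [1, 3]]

good = b_from_parameters(1, 2)      # [-3, -2, 0, 1, 2]
bad = [1, 0, 0, 0, 0]               # violates 2*b1 - 3*b2 + b3 = 0
print(is_consistent(A, good), is_consistent(A, bad))
```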


Remark The coefficient matrix for the linear system in the last example has $n = 2$ columns, and it has rank $r = 2$
because there are two nonzero rows in its reduced row echelon form. This implies that when the system is
consistent its general solution will contain $n - r = 0$ parameters; that is, the solution will be unique. With a
moment's thought, you should be able to see that this is so from (5).


The Fundamental Spaces of a Matrix 


There are six important vector spaces associated with a matrix A and its transpose $A^T$:

row space of A          row space of $A^T$
column space of A       column space of $A^T$
null space of A         null space of $A^T$


However, transposing a matrix converts row vectors into column vectors and conversely, so except for a difference
in notation, the row space of $A^T$ is the same as the column space of A, and the column space of $A^T$ is the same as
the row space of A. Thus, of the six spaces listed above, only the following four are distinct:


row space of A          column space of A
null space of A         null space of $A^T$


If A is an $m \times n$ matrix, then the row space and
null space of A are subspaces of $R^n$, and the
column space of A and the null space of $A^T$ are
subspaces of $R^m$.

These are called the fundamental spaces of a matrix А. We will conclude this section by discussing how these four 
subspaces are related. 


Let us focus for a moment on the matrix $A^T$. Since the row space and column space of a matrix have the same
dimension, and since transposing a matrix converts its columns to rows and its rows to columns, the following
result should not be surprising.


THEOREM 4.8.7


If A is any matrix, then rank(A) = rank($A^T$).


Proof

\[ \operatorname{rank}(A) = \dim(\text{row space of } A) = \dim(\text{column space of } A^T) = \operatorname{rank}(A^T) \]


This result has some important implications. For example, if A is an $m \times n$ matrix, then applying Formula (4) to the
matrix $A^T$ and using the fact that this matrix has m columns yields


\[ \operatorname{rank}(A^T) + \operatorname{nullity}(A^T) = m \]


which, by virtue of Theorem 4.8.7, can be rewritten as

\[ \operatorname{rank}(A) + \operatorname{nullity}(A^T) = m \tag{6} \]

This alternative form of Formula (4) in Theorem 4.8.2 makes it possible to express the dimensions of all four
fundamental spaces in terms of the size and rank of A. Specifically, if rank(A) = r, then

\[
\begin{aligned}
\dim[\operatorname{row}(A)] &= r & \dim[\operatorname{col}(A)] &= r \\
\dim[\operatorname{null}(A)] &= n - r & \dim[\operatorname{null}(A^T)] &= m - r
\end{aligned} \tag{7}
\]
The four formulas in (7) provide an algebraic relationship between the size of a matrix and the dimensions of its
fundamental spaces. Our next objective is to find a geometric relationship between the fundamental spaces
themselves. For this purpose recall from Theorem 3.4.3 that if A is an $m \times n$ matrix, then the null space of A
consists of those vectors that are orthogonal to each of the row vectors of A. To develop that idea in more detail, we
make the following definition.


DEFINITION 2 


If W is a subspace of $R^n$, then the set of all vectors in $R^n$ that are orthogonal to every vector in W is called
the orthogonal complement of W and is denoted by the symbol $W^\perp$.


The following theorem lists three basic properties of orthogonal complements. We will omit the formal proof 
because a more general version of this theorem will be given later in the text. 


THEOREM 4.8.8 


If $W$ is a subspace of $R^n$, then:
(a) $W^\perp$ is a subspace of $R^n$.
(b) The only vector common to $W$ and $W^\perp$ is $\mathbf{0}$.
(c) The orthogonal complement of $W^\perp$ is $W$.


EXAMPLE 7 Orthogonal Complements


In $R^2$ the orthogonal complement of a line $W$ through the origin is the line through the origin that is
perpendicular to $W$ (Figure 4.8.1a); and in $R^3$ the orthogonal complement of a plane $W$ through the
origin is the line through the origin that is perpendicular to that plane (Figure 4.8.1b).


Figure 4.8.1 [panels (a) and (b)]


Explain why $\{\mathbf{0}\}$ and $R^n$ are orthogonal complements.


A Geometric Link Between the Fundamental Spaces 


The following theorem provides a geometric link between the fundamental spaces of a matrix. Part (a) is essentially
a restatement of Theorem 3.4.3 in the language of orthogonal complements, and part (b), whose proof is left as an
exercise, follows from part (a). The essential idea of the theorem is illustrated in Figure 4.8.2.


THEOREM 4.8.9 


If $A$ is an $m \times n$ matrix, then:
(a) The null space of $A$ and the row space of $A$ are orthogonal complements in $R^n$.
(b) The null space of $A^T$ and the column space of $A$ are orthogonal complements in $R^m$.


Figure 4.8.2 
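A small numerical check can make Theorem 4.8.9 concrete. The sketch below is an illustration only: the matrix and the null space vectors are hypothetical values worked out by hand, and the helper names `dot` and `orthogonal_to_all` are assumptions. It confirms that each listed solution of $A\mathbf{x} = \mathbf{0}$ is orthogonal to every row of $A$, and that the listed solution of $A^T\mathbf{y} = \mathbf{0}$ is orthogonal to every column of $A$.

```python
def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def orthogonal_to_all(x, vectors):
    """True if x is orthogonal to every vector in the list."""
    return all(dot(x, v) == 0 for v in vectors)

# Hypothetical 2 x 3 matrix of rank 1.
A = [[1, 2, 3],
     [2, 4, 6]]
rows = A
columns = [list(c) for c in zip(*A)]

# Hand-computed solutions: A x = 0 for each x below, and A^T y = 0 for y below.
null_A = [[-2, 1, 0], [-3, 0, 1]]
null_AT = [[-2, 1]]

print(all(orthogonal_to_all(x, rows) for x in null_A))
print(all(orthogonal_to_all(y, columns) for y in null_AT))
```

Checking orthogonality against the rows (or columns) is enough, since any vector orthogonal to a spanning set is orthogonal to the whole space it spans.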


More on the Equivalence Theorem 


As our final result in this section, we will add two more statements to Theorem 4.8.4. We leave the proof that those
statements are equivalent to the rest as an exercise. 


THEOREM 4.8.10 Equivalent Statements 


If $A$ is an $n \times n$ matrix, then the following statements are equivalent.
(a) $A$ is invertible.
(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
(c) The reduced row echelon form of $A$ is $I_n$.
(d) $A$ is expressible as a product of elementary matrices.
(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$.
(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$.
(g) $\det(A) \neq 0$.
(h) The column vectors of $A$ are linearly independent.
(i) The row vectors of $A$ are linearly independent.
(j) The column vectors of $A$ span $R^n$.
(k) The row vectors of $A$ span $R^n$.
(l) The column vectors of $A$ form a basis for $R^n$.
(m) The row vectors of $A$ form a basis for $R^n$.
(n) $A$ has rank $n$.
(o) $A$ has nullity 0.
(p) The orthogonal complement of the null space of $A$ is $R^n$.
(q) The orthogonal complement of the row space of $A$ is $\{\mathbf{0}\}$.


Applications of Rank 


The advent of the Internet has stimulated research on finding efficient methods for transmitting large amounts of 
digital data over communications lines with limited bandwidths. Digital data are commonly stored in matrix form, 
and many techniques for improving transmission speed use the rank of a matrix in some way. Rank plays a role 
because it measures the “redundancy” in a matrix in the sense that if $A$ is an $m \times n$ matrix of rank $k$, then $n - k$ of
the column vectors and $m - k$ of the row vectors can be expressed in terms of $k$ linearly independent column or
row vectors. The essential idea in many data compression schemes is to approximate the original data set by a data 
set with smaller rank that conveys nearly the same information, then eliminate redundant vectors in the 
approximating set to speed up the transmission time. 


Concept Review 

* Rank
* Nullity
* Dimension Theorem
* Overdetermined system
* Underdetermined system
* Fundamental spaces of a matrix
* Relationships among the fundamental spaces
* Orthogonal complement
* Equivalent characterizations of invertible matrices


Skills
* Find the rank and nullity of a matrix.
* Find the dimension of the row space of a matrix.


Exercise Set 4.8 


1. Verify that $\operatorname{rank}(A) = \operatorname{rank}(A^T)$.


Answer:


$\operatorname{rank}(A) = \operatorname{rank}(A^T) = 2$


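2. Find the rank and nullity of the matrix; then verify that the values obtained satisfy Formula 4 in the Dimension
Theorem.

(a) $A = \begin{bmatrix} 1 & -1 & 3 \\ 5 & -4 & -4 \\ 7 & -6 & 2 \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & 0 & -1 \\ 4 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix}$

(c) $A = \begin{bmatrix} 1 & 4 & 5 & 2 \\ 2 & 1 & 3 & 0 \\ -1 & 3 & 2 & 2 \end{bmatrix}$

(d) $A = \begin{bmatrix} 1 & 4 & 5 & 6 & 9 \\ 3 & -2 & 1 & 4 & -1 \\ -1 & 0 & -1 & -2 & -1 \\ 2 & 3 & 5 & 7 & 8 \end{bmatrix}$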


(e) $A = \begin{bmatrix} 1 & -3 & 2 & 2 & 1 \\ 0 & 3 & 6 & 0 & -3 \\ 2 & -3 & -2 & 4 & 4 \\ 3 & -6 & 0 & 6 & 5 \end{bmatrix}$


3. In each part of Exercise 2, use the results obtained to find the number of leading variables and the number of
parameters in the solution of $A\mathbf{x} = \mathbf{0}$ without solving the system.


Answer: 


(a) 2; 1 
(b) 1;2 
(c) 2;2 
(d) 2;3 
(e) 3;2 


4. In each part, use the information in the table to find the dimension of the row space of $A$, column space of $A$,
null space of $A$, and null space of $A^T$.


(a) | j (0| (@) (9 (g) 


Size of A| 3х3 | 3x3| 3x3] 5x9| 9x5 Y 6x2 
Rank(A) 2 


5. In each part, find the largest possible value for the rank of $A$ and the smallest possible value for the nullity of $A$.
(a) $A$ is $4 \times 4$
(b) $A$ is $3 \times 5$
(c) $A$ is $5 \times 3$


Answer:


(a) Rank = 4, nullity = 0
(b) Rank = 3, nullity = 2
(c) Rank = 3, nullity = 0

6. If $A$ is an $m \times n$ matrix, what is the largest possible value for its rank and the smallest possible value for its
nullity?


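7. In each part, use the information in the table to determine whether the linear system $A\mathbf{x} = \mathbf{b}$ is consistent. If so,
state the number of parameters in its general solution.


 | (a) | (b) | (c) | (d) | (e) | (f) | (g)
Size of $A$ | $3 \times 3$ | $3 \times 3$ | $3 \times 3$ | $5 \times 9$ | $5 \times 9$ | $4 \times 4$ | $6 \times 2$
Rank($A$) | 3 | 2 | 1 | 2 | 2 | 0 | 2
Rank[$A \mid \mathbf{b}$] | 3 | 3 | 1 | 2 | 3 | 0 | 2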




Answer: 


(a) Yes, 0 
(b) No 

(c) Yes, 2 
(d) Yes, 7 
(e) No 

(f) Yes, 4 
(g) Yes, 0 


8. For each of the matrices in Exercise 7, find the nullity of $A$, and determine the number of parameters in the
general solution of the homogeneous linear system $A\mathbf{x} = \mathbf{0}$.


9. What conditions must be satisfied by $b_1$, $b_2$, $b_3$, $b_4$, and $b_5$ for the overdetermined linear system

$$\begin{aligned} x_1 - 3x_2 &= b_1 \\ x_1 - 2x_2 &= b_2 \\ x_1 + x_2 &= b_3 \\ x_1 - 4x_2 &= b_4 \\ x_1 + 5x_2 &= b_5 \end{aligned}$$

to be consistent?


Answer:


$b_1 = r,\ b_2 = s,\ b_3 = 4s - 3r,\ b_4 = 2r - s,\ b_5 = 8s - 7r$
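10. Let

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$$

Show that $A$ has rank 2 if and only if one or more of the determinants

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}, \quad \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$$

is nonzero.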


11. Suppose that $A$ is a $3 \times 3$ matrix whose null space is a line through the origin in 3-space. Can the row or column
space of $A$ also be a line through the origin? Explain.


Answer:

No
12. Discuss how the rank of $A$ varies with $t$.

(a) $A = \begin{bmatrix} 1 & 1 & t \\ 1 & t & 1 \\ t & 1 & 1 \end{bmatrix}$
(b) É 3 =l 




13. Are there values of $r$ and $s$ for which

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & r - 2 & 2 \\ 0 & s - 1 & r + 2 \\ 0 & 0 & 3 \end{bmatrix}$$

has rank 1? Has rank 2? If so, find those values.

Answer:


Rank is 2 if $r = 2$ and $s = 1$; the rank is never 1.


14. Use the result in Exercise 10 to show that the set of points $(x, y, z)$ in $R^3$ for which the matrix

$$\begin{bmatrix} x & y & z \\ 1 & x & y \end{bmatrix}$$

has rank 1 is the curve with parametric equations $x = t$, $y = t^2$, $z = t^3$.


15. Prove: If $k \neq 0$, then $A$ and $kA$ have the same rank.


16. (a) Give an example of a $3 \times 3$ matrix whose column space is a plane through the origin in 3-space.
(b) What kind of geometric object is the null space of your matrix?
(c) What kind of geometric object is the row space of your matrix?


17. (a) If $A$ is a $3 \times 5$ matrix, then the number of leading 1's in the reduced row echelon form of $A$ is at most ______. Why?
(b) If $A$ is a $3 \times 5$ matrix, then the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$ is at most ______. Why?
(c) If $A$ is a $5 \times 3$ matrix, then the number of leading 1's in the reduced row echelon form of $A$ is at most ______. Why?
(d) If $A$ is a $5 \times 3$ matrix, then the number of parameters in the general solution of $A\mathbf{x} = \mathbf{0}$ is at most ______. Why?

Answer:

(a) 3
(b) 5
(c) 3
(d) 3

18. (a) If $A$ is a $3 \times 5$ matrix, then the rank of $A$ is at most ______. Why?
(b) If $A$ is a $3 \times 5$ matrix, then the nullity of $A$ is at most ______. Why?
(c) If $A$ is a $3 \times 5$ matrix, then the rank of $A^T$ is at most ______. Why?
(d) If $A$ is a $3 \times 5$ matrix, then the nullity of $A^T$ is at most ______. Why?


19. Find matrices $A$ and $B$ for which $\operatorname{rank}(A) = \operatorname{rank}(B)$, but $\operatorname{rank}(A^2) \neq \operatorname{rank}(B^2)$.


Answer: 


$A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$
20. Prove: If a matrix A is not square, then either the row vectors or the column vectors of A are linearly dependent. 
True-False Exercises 
In parts (a)-(j) determine whether the statement is true or false, and justify your answer.
(a) Either the row vectors or the column vectors of a square matrix are linearly independent. 


Answer: 


False 


(b) A matrix with linearly independent row vectors and linearly independent column vectors is square. 
Answer: 


True 


(c) The nullity of a nonzero $m \times n$ matrix is at most $m$.
Answer: 


False 


(d) Adding one additional column to a matrix increases its rank by one. 
Answer: 


False 


(e) The nullity of a square matrix with linearly dependent rows is at least one. 
Answer: 


True 


(f) If $A$ is square and $A\mathbf{x} = \mathbf{b}$ is inconsistent for some vector $\mathbf{b}$, then the nullity of $A$ is zero.
Answer: 


False 


(g) If a matrix A has more rows than columns, then the dimension of the row space is greater than the dimension of 
the column space. 


Answer: 
False 

(h) If $\operatorname{rank}(A^T) = \operatorname{rank}(A)$, then $A$ is square.
Answer: 


False 


(i) There is no 3 x 3 matrix whose row space and null space are both lines in 3-space. 


Answer: 


True 


(j) If $V$ is a subspace of $R^n$ and $W$ is a subspace of $V$, then $W^\perp$ is a subspace of $V^\perp$.
Answer: 


False 
4.9 Matrix Transformations from $R^n$ to $R^m$


In this section we will study functions of the form $\mathbf{w} = F(\mathbf{x})$, where the independent variable $\mathbf{x}$ is a vector in $R^n$ and the
dependent variable $\mathbf{w}$ is a vector in $R^m$. We will concentrate on a special class of such functions called “matrix
transformations.” Such transformations are fundamental in the study of linear algebra and have important applications
in physics, engineering, social sciences, and various branches of mathematics.


Functions and Transformations 


Recall that a function is a rule that associates with each element of a set $A$ one and only one element in a set $B$. If $f$
associates the element $b$ with the element $a$, then we write

$$b = f(a)$$

and we say that $b$ is the image of $a$ under $f$ or that $f(a)$ is the value of $f$ at $a$. The set $A$ is called the domain of $f$ and the
set $B$ the codomain of $f$ (Figure 4.9.1). The subset of the codomain that consists of all images of points in the domain is
called the range of $f$.


Figure 4.9.1 [A function $f$ maps an element $a$ of the domain $A$ to its image $b = f(a)$ in the codomain $B$.]


For many common functions the domain and codomain are sets of real numbers, but in this text we will be concerned 
with functions for which the domain and codomain are vector spaces. 


DEFINITION 1 


If $V$ and $W$ are vector spaces, and if $f$ is a function with domain $V$ and codomain $W$, then we say that $f$ is a
transformation from $V$ to $W$ or that $f$ maps $V$ to $W$, which we denote by writing

$$f: V \to W$$

In the special case where $V = W$, the transformation is also called an operator on $V$.


In this section we will be concerned exclusively with transformations from $R^n$ to $R^m$; transformations of general vector
spaces will be considered in a later section. To illustrate one way in which such transformations can arise, suppose that
$f_1, f_2, \ldots, f_m$ are real-valued functions of $n$ variables, say

$$\begin{aligned} w_1 &= f_1(x_1, x_2, \ldots, x_n) \\ w_2 &= f_2(x_1, x_2, \ldots, x_n) \\ &\ \,\vdots \\ w_m &= f_m(x_1, x_2, \ldots, x_n) \end{aligned} \tag{1}$$

These $m$ equations assign a unique point $(w_1, w_2, \ldots, w_m)$ in $R^m$ to each point $(x_1, x_2, \ldots, x_n)$ in $R^n$ and thus define a
transformation from $R^n$ to $R^m$. If we denote this transformation by $T$, then $T: R^n \to R^m$ and

$$T(x_1, x_2, \ldots, x_n) = (w_1, w_2, \ldots, w_m)$$


Matrix Transformations 


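In the special case where the equations in 1 are linear, they can be expressed in the form

$$\begin{aligned} w_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ w_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ &\ \,\vdots \\ w_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{aligned} \tag{2}$$

which we can write in matrix notation as

$$\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{3}$$

or more briefly as

$$\mathbf{w} = A\mathbf{x} \tag{4}$$

Although we could view this as a linear system, we will view it instead as a transformation that maps the column vector
$\mathbf{x}$ in $R^n$ into the column vector $\mathbf{w}$ in $R^m$ by multiplying $\mathbf{x}$ on the left by $A$. We call this a matrix transformation (or
matrix operator if $m = n$), and we denote it by $T_A: R^n \to R^m$. With this notation, Equation 4 can be expressed as

$$\mathbf{w} = T_A(\mathbf{x}) \tag{5}$$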


The matrix transformation $T_A$ is called multiplication by $A$, and the matrix $A$ is called the standard matrix for the
transformation.


We will also find it convenient, on occasion, to express 5 in the schematic form

$$\mathbf{x} \xrightarrow{\ T_A\ } \mathbf{w} \tag{6}$$

which is read “$T_A$ maps $\mathbf{x}$ into $\mathbf{w}$.”
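EXAMPLE 1 A Matrix Transformation from $R^4$ to $R^3$


The matrix transformation $T: R^4 \to R^3$ defined by the equations

$$\begin{aligned} w_1 &= 2x_1 - 3x_2 + x_3 - 5x_4 \\ w_2 &= 4x_1 + x_2 - 2x_3 + x_4 \\ w_3 &= 5x_1 - x_2 + 4x_3 \end{aligned} \tag{7}$$

can be expressed in matrix form as

$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \tag{8}$$

so the standard matrix for $T$ is

$$A = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix}$$

The image of a point $(x_1, x_2, x_3, x_4)$ can be computed directly from the defining equations 7 or from 8
by matrix multiplication. For example, if

$$(x_1, x_2, x_3, x_4) = (1, -3, 0, 2)$$

then substituting in 7 yields $w_1 = 1$, $w_2 = 3$, $w_3 = 8$ (verify), or alternatively from 8,

$$\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 2 & -3 & 1 & -5 \\ 4 & 1 & -2 & 1 \\ 5 & -1 & 4 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ -3 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 8 \end{bmatrix}$$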
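The computation in Example 1 can be reproduced in a few lines. This is a minimal sketch in plain Python; the helper name `matvec` is an assumption introduced for illustration, and the matrix and vector are the ones from the example.

```python
def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Standard matrix of the transformation in Example 1.
A = [[2, -3, 1, -5],
     [4, 1, -2, 1],
     [5, -1, 4, 0]]

x = [1, -3, 0, 2]
w = matvec(A, x)
print(w)  # [1, 3, 8]
```

Each entry of `w` is the dot product of one row of `A` with `x`, which is exactly one of the three defining equations evaluated at the given point.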


Some Notational Matters 


Sometimes we will want to denote a matrix transformation without giving a name to the matrix itself. In such cases we
will denote the standard matrix for $T: R^n \to R^m$ by the symbol $[T]$. Thus, the equation

$$T(\mathbf{x}) = [T]\mathbf{x} \tag{9}$$

is simply the statement that $T$ is a matrix transformation with standard matrix $[T]$, and the image of $\mathbf{x}$ under this
transformation is the product of the matrix $[T]$ and the column vector $\mathbf{x}$.


Properties of Matrix Transformations 


The following theorem lists four basic properties of matrix transformations that follow from properties of matrix 
multiplication. 


THEOREM 4.9.1 


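For every matrix $A$ the matrix transformation $T_A: R^n \to R^m$ has the following properties for all vectors $\mathbf{u}$ and $\mathbf{v}$
in $R^n$ and for every scalar $k$:

(a) $T_A(\mathbf{0}) = \mathbf{0}$
(b) $T_A(k\mathbf{u}) = kT_A(\mathbf{u})$ [Homogeneity property]
(c) $T_A(\mathbf{u} + \mathbf{v}) = T_A(\mathbf{u}) + T_A(\mathbf{v})$ [Additivity property]
(d) $T_A(\mathbf{u} - \mathbf{v}) = T_A(\mathbf{u}) - T_A(\mathbf{v})$


Proof All four parts are restatements of familiar properties of matrix multiplication:

$$A\mathbf{0} = \mathbf{0}, \quad A(k\mathbf{u}) = k(A\mathbf{u}), \quad A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v}, \quad A(\mathbf{u} - \mathbf{v}) = A\mathbf{u} - A\mathbf{v}$$


It follows from Theorem 4.9.1 that a matrix transformation maps linear combinations of vectors in $R^n$ into the
corresponding linear combinations in $R^m$ in the sense that

$$T_A(k_1\mathbf{u}_1 + k_2\mathbf{u}_2 + \cdots + k_r\mathbf{u}_r) = k_1T_A(\mathbf{u}_1) + k_2T_A(\mathbf{u}_2) + \cdots + k_rT_A(\mathbf{u}_r) \tag{10}$$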
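Formula 10 can be spot-checked for a particular matrix. The sketch below is only an illustration: the matrix is the one from Example 1, while the vectors, the scalars, and the helper names `matvec` and `combine` are hypothetical choices. Integer entries are used so that the two sides can be compared exactly.

```python
def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def combine(k1, u, k2, v):
    """Linear combination k1*u + k2*v of two vectors of equal length."""
    return [k1 * a + k2 * b for a, b in zip(u, v)]

A = [[2, -3, 1, -5],
     [4, 1, -2, 1],
     [5, -1, 4, 0]]

u1, u2 = [1, -3, 0, 2], [0, 1, 1, -1]   # hypothetical vectors in R^4
k1, k2 = 2, -3                           # hypothetical scalars

lhs = matvec(A, combine(k1, u1, k2, u2))             # T_A(k1 u1 + k2 u2)
rhs = combine(k1, matvec(A, u1), k2, matvec(A, u2))  # k1 T_A(u1) + k2 T_A(u2)
print(lhs == rhs)
```

A single numerical check does not prove the property, but it shows what the two sides of Formula 10 mean in practice: transform the combination, or combine the transformed vectors.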


Depending on whether n-tuples and m-tuples are regarded as vectors or points, the geometric effect of a matrix
transformation $T_A: R^n \to R^m$ is to map each vector (point) in $R^n$ into a vector (point) in $R^m$ (Figure 4.9.2).


Figure 4.9.2 [Left panel: $T$ maps vectors to vectors. Right panel: $T$ maps points to points.]


The following theorem states that if two matrix transformations from $R^n$ to $R^m$ have the same image at each point of
$R^n$, then the matrices themselves must be the same.


THEOREM 4.9.2


If $T_A: R^n \to R^m$ and $T_B: R^n \to R^m$ are matrix transformations, and if $T_A(\mathbf{x}) = T_B(\mathbf{x})$ for every vector $\mathbf{x}$ in $R^n$,
then $A = B$.


Proof To say that $T_A(\mathbf{x}) = T_B(\mathbf{x})$ for every vector in $R^n$ is the same as saying that

$$A\mathbf{x} = B\mathbf{x}$$

for every vector $\mathbf{x}$ in $R^n$. This is true, in particular, if $\mathbf{x}$ is any of the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ for $R^n$; that is,

$$A\mathbf{e}_j = B\mathbf{e}_j \quad (j = 1, 2, \ldots, n) \tag{11}$$

Since every entry of $\mathbf{e}_j$ is 0 except for the $j$th, which is 1, it follows from Theorem 1.3.1 that $A\mathbf{e}_j$ is the $j$th column of $A$
and $B\mathbf{e}_j$ is the $j$th column of $B$. Thus, it follows from 11 that corresponding columns of $A$ and $B$ are the same, and hence
that $A = B$.


EXAMPLE 2 Zero Transformations


If $0$ is the $m \times n$ zero matrix, then

$$T_0(\mathbf{x}) = 0\mathbf{x} = \mathbf{0}$$

so multiplication by zero maps every vector in $R^n$ into the zero vector in $R^m$. We call $T_0$ the zero
transformation from $R^n$ to $R^m$.


EXAMPLE 3 Identity Operators


If $I$ is the $n \times n$ identity matrix, then

$$T_I(\mathbf{x}) = I\mathbf{x} = \mathbf{x}$$

so multiplication by $I$ maps every vector in $R^n$ into itself. We call $T_I$ the identity operator on $R^n$.


A Procedure for Finding Standard Matrices 


There is a way of finding the standard matrix for a matrix transformation from $R^n$ to $R^m$ by considering the effect of
that transformation on the standard basis vectors for $R^n$. To explain the idea, suppose that $A$ is unknown and that

$$\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$$

are the standard basis vectors for $R^n$. Suppose also that the images of these vectors under the transformation $T_A$ are

$$T_A(\mathbf{e}_1) = A\mathbf{e}_1, \quad T_A(\mathbf{e}_2) = A\mathbf{e}_2, \quad \ldots, \quad T_A(\mathbf{e}_n) = A\mathbf{e}_n$$

It follows from Theorem 1.3.1 that $A\mathbf{e}_j$ is a linear combination of the columns of $A$ in which the successive coefficients
are the entries of $\mathbf{e}_j$. But all entries of $\mathbf{e}_j$ are zero except the $j$th, so the product $A\mathbf{e}_j$ is just the $j$th column of the matrix
$A$. Thus,

$$A = [\,T_A(\mathbf{e}_1) \mid T_A(\mathbf{e}_2) \mid \cdots \mid T_A(\mathbf{e}_n)\,] \tag{12}$$


In summary, we have the following procedure for finding the standard matrix for a matrix transformation: 


Finding the Standard Matrix for a Matrix Transformation 


Step 1. Find the images of the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ for $R^n$ in column form.


Step 2. Construct the matrix that has the images obtained in Step 1 as its successive columns. This matrix is the 
standard matrix for the transformation. 
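The two-step procedure translates directly into code. The following is a minimal sketch under stated assumptions: the transformation is supplied as an ordinary Python function, and the names `standard_matrix`, `reflect_about_y_equals_x`, and `T_example1` are hypothetical. The procedure only recovers a meaningful matrix when the supplied function really is a matrix transformation.

```python
def standard_matrix(T, n):
    """Standard matrix of a matrix transformation T on R^n.

    Step 1: compute the images T(e_1), ..., T(e_n) of the standard basis vectors.
    Step 2: use those images as the successive columns of the matrix.
    """
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(list(T(e_j)))
    # The images are columns, so transpose to obtain a list of rows.
    return [list(row) for row in zip(*columns)]

def reflect_about_y_equals_x(v):
    """T(x, y) = (y, x), the reflection about the line y = x."""
    x, y = v
    return (y, x)

def T_example1(v):
    """The transformation from R^4 to R^3 defined by equations 7 in Example 1."""
    x1, x2, x3, x4 = v
    return (2*x1 - 3*x2 + x3 - 5*x4,
            4*x1 + x2 - 2*x3 + x4,
            5*x1 - x2 + 4*x3)

print(standard_matrix(reflect_about_y_equals_x, 2))  # [[0, 1], [1, 0]]
print(standard_matrix(T_example1, 4))
```

The second call rebuilds the standard matrix of Example 1 from the defining equations alone, without reading the coefficients off by eye.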


Reflection Operators 


Some of the most basic matrix operators on $R^2$ and $R^3$ are those that map each point into its symmetric image about a
fixed line or a fixed plane; these are called reflection operators. Table 1 shows the standard matrices for the reflections
about the coordinate axes in $R^2$, and Table 2 shows the standard matrices for the reflections about the coordinate planes
in $R^3$. In each case the standard matrix was obtained by finding the images of the standard basis vectors, converting
those images to column vectors, and then using those column vectors as successive columns of the standard matrix.


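Table 1

Operator | Images of $\mathbf{e}_1$ and $\mathbf{e}_2$ | Standard Matrix
Reflection about the y-axis, $T(x, y) = (-x, y)$ | $T(\mathbf{e}_1) = T(1, 0) = (-1, 0)$; $T(\mathbf{e}_2) = T(0, 1) = (0, 1)$ | $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$
Reflection about the x-axis, $T(x, y) = (x, -y)$ | $T(\mathbf{e}_1) = T(1, 0) = (1, 0)$; $T(\mathbf{e}_2) = T(0, 1) = (0, -1)$ | $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$
Reflection about the line $y = x$, $T(x, y) = (y, x)$ | $T(\mathbf{e}_1) = T(1, 0) = (0, 1)$; $T(\mathbf{e}_2) = T(0, 1) = (1, 0)$ | $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$


Table 2

Operator | Images of $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ | Standard Matrix
Reflection about the xy-plane, $T(x, y, z) = (x, y, -z)$ | $T(1, 0, 0) = (1, 0, 0)$; $T(0, 1, 0) = (0, 1, 0)$; $T(0, 0, 1) = (0, 0, -1)$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$
Reflection about the xz-plane, $T(x, y, z) = (x, -y, z)$ | $T(1, 0, 0) = (1, 0, 0)$; $T(0, 1, 0) = (0, -1, 0)$; $T(0, 0, 1) = (0, 0, 1)$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Reflection about the yz-plane, $T(x, y, z) = (-x, y, z)$ | $T(1, 0, 0) = (-1, 0, 0)$; $T(0, 1, 0) = (0, 1, 0)$; $T(0, 0, 1) = (0, 0, 1)$ | $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$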


Projection Operators 


Matrix operators on $R^2$ and $R^3$ that map each point into its orthogonal projection on a fixed line or plane are called
projection operators (or more precisely, orthogonal projection operators). Table 3 shows the standard matrices for the
orthogonal projections on the coordinate axes in $R^2$, and Table 4 shows the standard matrices for the orthogonal
projections on the coordinate planes in $R^3$.


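Table 3

Operator | Images of $\mathbf{e}_1$ and $\mathbf{e}_2$ | Standard Matrix
Orthogonal projection on the x-axis, $T(x, y) = (x, 0)$ | $T(\mathbf{e}_1) = T(1, 0) = (1, 0)$; $T(\mathbf{e}_2) = T(0, 1) = (0, 0)$ | $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$
Orthogonal projection on the y-axis, $T(x, y) = (0, y)$ | $T(\mathbf{e}_1) = T(1, 0) = (0, 0)$; $T(\mathbf{e}_2) = T(0, 1) = (0, 1)$ | $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$


Table 4

Operator | Images of $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ | Standard Matrix
Orthogonal projection on the xy-plane, $T(x, y, z) = (x, y, 0)$ | $T(1, 0, 0) = (1, 0, 0)$; $T(0, 1, 0) = (0, 1, 0)$; $T(0, 0, 1) = (0, 0, 0)$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
Orthogonal projection on the xz-plane, $T(x, y, z) = (x, 0, z)$ | $T(1, 0, 0) = (1, 0, 0)$; $T(0, 1, 0) = (0, 0, 0)$; $T(0, 0, 1) = (0, 0, 1)$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Orthogonal projection on the yz-plane, $T(x, y, z) = (0, y, z)$ | $T(1, 0, 0) = (0, 0, 0)$; $T(0, 1, 0) = (0, 1, 0)$; $T(0, 0, 1) = (0, 0, 1)$ | $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$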


Rotation Operators 


Matrix operators on $R^2$ and $R^3$ that move points along circular arcs are called rotation operators. Let us consider how
to find the standard matrix for the rotation operator $T: R^2 \to R^2$ that moves points counterclockwise about the origin
through an angle $\theta$ (Figure 4.9.3). As illustrated in Figure 4.9.3, the images of the standard basis vectors are

$$T(\mathbf{e}_1) = T(1, 0) = (\cos\theta, \sin\theta) \quad \text{and} \quad T(\mathbf{e}_2) = T(0, 1) = (-\sin\theta, \cos\theta)$$

so the standard matrix for $T$ is

$$[\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2)\,] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$


Figure 4.9.3


In keeping with common usage we will denote this operator by $R_\theta$ and call

$$R_\theta = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{13}$$

the rotation matrix for $R^2$. If $\mathbf{x} = (x, y)$ is a vector in $R^2$, and if $\mathbf{w} = (w_1, w_2)$ is its image under the rotation, then the
relationship $\mathbf{w} = R_\theta\mathbf{x}$ can be written in component form as

$$\begin{aligned} w_1 &= x\cos\theta - y\sin\theta \\ w_2 &= x\sin\theta + y\cos\theta \end{aligned} \tag{14}$$

These are called the rotation equations for $R^2$. These ideas are summarized in Table 5.


Table 5

Operator | Rotation Equations | Standard Matrix
Rotation through an angle $\theta$ | $w_1 = x\cos\theta - y\sin\theta$; $w_2 = x\sin\theta + y\cos\theta$ | $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$


In the plane, counterclockwise angles are positive and clockwise angles are negative. The rotation
matrix for a clockwise rotation of $-\theta$ radians can be obtained by replacing $\theta$ by $-\theta$ in 13. After
simplification this yields

$$R_{-\theta} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$


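EXAMPLE 4 A Rotation Operator


Find the image of $\mathbf{x} = (1, 1)$ under a rotation of $\pi/6$ radians ($= 30^\circ$) about the origin.


Solution It follows from 13 with $\theta = \pi/6$ that

$$R_{\pi/6}\mathbf{x} = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3} - 1}{2} \\ \frac{1 + \sqrt{3}}{2} \end{bmatrix} \approx \begin{bmatrix} 0.37 \\ 1.37 \end{bmatrix}$$

or in comma-delimited notation, $R_{\pi/6}(1, 1) \approx (0.37, 1.37)$.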
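The rotation in Example 4 can be checked numerically. This is a minimal sketch using only the standard `math` module; the function names `rotation_matrix` and `matvec` are assumptions introduced for illustration.

```python
import math

def rotation_matrix(theta):
    """Standard matrix (Formula 13) for a counterclockwise rotation through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

w = matvec(rotation_matrix(math.pi / 6), [1, 1])
print([round(component, 2) for component in w])  # [0.37, 1.37]
```

Replacing `math.pi / 6` by its negative gives the corresponding clockwise rotation, in line with the sign convention for angles in the plane.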


Rotations in $R^3$


A rotation of vectors in $R^3$ is usually described in relation to a ray emanating from the origin, called the axis of
rotation. As a vector revolves around the axis of rotation, it sweeps out some portion of a cone (Figure 4.9.4a). The
angle of rotation, which is measured in the base of the cone, is described as “clockwise” or “counterclockwise” in
relation to a viewpoint that is along the axis of rotation looking toward the origin. For example, in Figure 4.9.4a the
vector $\mathbf{w}$ results from rotating the vector $\mathbf{x}$ counterclockwise around the axis $l$ through an angle $\theta$. As in $R^2$, angles are
positive if they are generated by counterclockwise rotations and negative if they are generated by clockwise rotations.


Figure 4.9.4 [(a) Angle of rotation; (b) Right-hand rule]


The most common way of describing a general axis of rotation is to specify a nonzero vector $\mathbf{u}$ that runs along the axis
of rotation and has its initial point at the origin. The counterclockwise direction for a rotation about the axis can then be
determined by a “right-hand rule” (Figure 4.9.4b): If the thumb of the right hand points in the direction of $\mathbf{u}$, then the
cupped fingers point in a counterclockwise direction.


A rotation operator on $R^3$ is a matrix operator that rotates each vector in $R^3$ about some rotation axis through a fixed
angle $\theta$. In Table 6 we have described the rotation operators on $R^3$ whose axes of rotation are the positive coordinate
axes. For each of these rotations one of the components is unchanged, and the relationships between the other
components can be derived by the same procedure used to derive 14. For example, in the rotation about the z-axis, the
z-components of $\mathbf{x}$ and $\mathbf{w} = T(\mathbf{x})$ are the same, and the x- and y-components are related as in 14. This yields the rotation
equation shown in the last row of Table 6.


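Table 6

Operator | Rotation Equations | Standard Matrix
Counterclockwise rotation about the positive x-axis through an angle $\theta$ | $w_1 = x$; $w_2 = y\cos\theta - z\sin\theta$; $w_3 = y\sin\theta + z\cos\theta$ | $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}$
Counterclockwise rotation about the positive y-axis through an angle $\theta$ | $w_1 = x\cos\theta + z\sin\theta$; $w_2 = y$; $w_3 = -x\sin\theta + z\cos\theta$ | $\begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}$
Counterclockwise rotation about the positive z-axis through an angle $\theta$ | $w_1 = x\cos\theta - y\sin\theta$; $w_2 = x\sin\theta + y\cos\theta$; $w_3 = z$ | $\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$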


For completeness, we note that the standard matrix for a counterclockwise rotation through an angle $\theta$ about an axis in
$R^3$, which is determined by an arbitrary unit vector $\mathbf{u} = (a, b, c)$ that has its initial point at the origin, is

$$\begin{bmatrix} a^2(1 - \cos\theta) + \cos\theta & ab(1 - \cos\theta) - c\sin\theta & ac(1 - \cos\theta) + b\sin\theta \\ ab(1 - \cos\theta) + c\sin\theta & b^2(1 - \cos\theta) + \cos\theta & bc(1 - \cos\theta) - a\sin\theta \\ ac(1 - \cos\theta) - b\sin\theta & bc(1 - \cos\theta) + a\sin\theta & c^2(1 - \cos\theta) + \cos\theta \end{bmatrix} \tag{15}$$

The derivation can be found in the book Principles of Interactive Computer Graphics, by W. M. Newman and R. F.
Sproull (New York: McGraw-Hill, 1979). You may find it instructive to derive the results in Table 6 as special cases of
this more general result.
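The special cases mentioned above can also be checked numerically. The sketch below is an illustration only: the function names `axis_rotation`, `rotation_about_z`, `matvec`, and `close` are hypothetical, and the test angle and the unit vector $(1/3, 2/3, 2/3)$ are arbitrary choices. It builds the matrix in Formula 15, compares it with the z-axis rotation from Table 6, and confirms that the axis vector itself is left fixed.

```python
import math

def axis_rotation(u, theta):
    """Matrix of Formula 15 for a unit vector u = (a, b, c) and angle theta."""
    a, b, c = u
    C, S = math.cos(theta), math.sin(theta)
    t = 1 - C
    return [
        [a*a*t + C,   a*b*t - c*S, a*c*t + b*S],
        [a*b*t + c*S, b*b*t + C,   b*c*t - a*S],
        [a*c*t - b*S, b*c*t + a*S, c*c*t + C],
    ]

def rotation_about_z(theta):
    """Standard matrix from Table 6 for rotation about the positive z-axis."""
    C, S = math.cos(theta), math.sin(theta)
    return [[C, -S, 0],
            [S, C, 0],
            [0, 0, 1]]

def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def close(M, N, tol=1e-12):
    """Entrywise comparison of two matrices within a tolerance."""
    return all(abs(p - q) <= tol for r1, r2 in zip(M, N) for p, q in zip(r1, r2))

theta = 0.7  # arbitrary test angle in radians
print(close(axis_rotation((0, 0, 1), theta), rotation_about_z(theta)))

u = (1/3, 2/3, 2/3)  # a unit vector, since 1/9 + 4/9 + 4/9 = 1
print(matvec(axis_rotation(u, theta), list(u)))  # approximately equal to u
```

The formula assumes that `u` has length 1; for a general nonzero axis vector one would normalize it first.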


Dilations and Contractions 


If $k$ is a nonnegative scalar, then the operator $T(\mathbf{x}) = k\mathbf{x}$ on $R^2$ or $R^3$ has the effect of increasing or decreasing the
length of each vector by a factor of $k$. If $0 \le k < 1$ the operator is called a contraction with factor $k$, and if $k > 1$ it is
called a dilation with factor $k$ (Figure 4.9.5). If $k = 1$, then $T$ is the identity operator and can be regarded either as a
contraction or a dilation. Tables 7 and 8 illustrate these operators.


Figure 4.9.5 [(a) $0 \le k < 1$; (b) $k > 1$; each panel shows $\mathbf{x}$ and $T(\mathbf{x}) = k\mathbf{x}$]
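Table 7

Operator | Effect on the Standard Basis | Standard Matrix
Contraction with factor $k$ on $R^2$ ($0 \le k < 1$), $T(x, y) = (kx, ky)$ | $(1, 0) \to (k, 0)$; $(0, 1) \to (0, k)$ | $\begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}$
Dilation with factor $k$ on $R^2$ ($k > 1$), $T(x, y) = (kx, ky)$ | $(1, 0) \to (k, 0)$; $(0, 1) \to (0, k)$ | $\begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix}$


Table 8

Operator | Standard Matrix
Contraction with factor $k$ on $R^3$ ($0 \le k < 1$), $T(x, y, z) = (kx, ky, kz)$ | $\begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{bmatrix}$
Dilation with factor $k$ on $R^3$ ($k \ge 1$), $T(x, y, z) = (kx, ky, kz)$ | $\begin{bmatrix} k & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & k \end{bmatrix}$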


Yaw, Pitch, and Roll 


In aeronautics and astronautics, the orientation of an aircraft or space shuttle relative to an xyz-coordinate 
system is often described in terms of angles called yaw, pitch, and roll. If, for example, an aircraft is flying 


along the y-axis and the xy-plane defines the horizontal, then the aircraft's angle of rotation about the z-axis is 
called the yaw, its angle of rotation about the x-axis is called the pitch, and its angle of rotation about the y-axis
is called the roll. A combination of yaw, pitch, and roll can be achieved by a single rotation about some axis 
through the origin. This is, in fact, how a space shuttle makes attitude adjustments—it doesn't perform each 
rotation separately; it calculates one axis, and rotates about that axis to get the correct orientation. Such rotation 
maneuvers are used to align an antenna, point the nose toward a celestial object, or position a payload bay for 
docking. 


Expansions and Compressions


In a dilation or contraction of $R^2$ or $R^3$, all coordinates are multiplied by a factor $k$. If only one of the coordinates is
multiplied by $k$, then the resulting operator is called an expansion or compression with factor $k$. This is illustrated in
Table 9 for $R^2$. You should have no trouble extending these results to $R^3$.


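Table 9

Operator | Effect on the Standard Basis | Standard Matrix
Compression of $R^2$ in the x-direction with factor $k$ ($0 \le k < 1$), $T(x, y) = (kx, y)$ | $(1, 0) \to (k, 0)$; $(0, 1) \to (0, 1)$ | $\begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$
Expansion of $R^2$ in the x-direction with factor $k$ ($k \ge 1$), $T(x, y) = (kx, y)$ | $(1, 0) \to (k, 0)$; $(0, 1) \to (0, 1)$ | $\begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$
Compression of $R^2$ in the y-direction with factor $k$ ($0 \le k < 1$), $T(x, y) = (x, ky)$ | $(1, 0) \to (1, 0)$; $(0, 1) \to (0, k)$ | $\begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}$
Expansion of $R^2$ in the y-direction with factor $k$ ($k \ge 1$), $T(x, y) = (x, ky)$ | $(1, 0) \to (1, 0)$; $(0, 1) \to (0, k)$ | $\begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}$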


Shears 


A matrix operator of the form $T(x, y) = (x + ky, y)$ translates a point $(x, y)$ in the xy-plane parallel to the x-axis by
an amount $ky$ that is proportional to the y-coordinate of the point. This operator leaves the points on the x-axis fixed
(since $y = 0$), but as we progress away from the x-axis, the translation distance increases. We call this operator the
shear in the x-direction with factor $k$. Similarly, a matrix operator of the form $T(x, y) = (x, y + kx)$ is called the
shear in the y-direction with factor $k$. Table 10 illustrates the basic information about shears in $R^2$.


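Table 10

Operator | Effect on the Standard Basis | Standard Matrix
Shear of $R^2$ in the x-direction with factor $k$, $T(x, y) = (x + ky, y)$ | $(1, 0) \to (1, 0)$; $(0, 1) \to (k, 1)$ [illustrated for $k > 0$ and $k < 0$] | $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$
Shear of $R^2$ in the y-direction with factor $k$, $T(x, y) = (x, y + kx)$ | $(1, 0) \to (1, k)$; $(0, 1) \to (0, 1)$ [illustrated for $k > 0$ and $k < 0$] | $\begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}$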


EXAMPLE 5 Some Basic Matrix Operators on $R^2$


In each part describe the matrix operator corresponding to $A$, and show its effect on the unit square.

$$A_1 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$$


Solution By comparing the forms of these matrices to those in Tables 7, 9, and 10, we see that the
matrix $A_1$ corresponds to a shear in the x-direction with factor 2, the matrix $A_2$ corresponds to a dilation
with factor 2, and $A_3$ corresponds to an expansion in the x-direction with factor 2. The effects of these
operators on the unit square are shown in Figure 4.9.6.
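The effect of each operator on the unit square can be tabulated by transforming its four corners. The sketch below is an illustration under stated assumptions: the matrices are a shear in the x-direction with factor 2, a dilation with factor 2, and an expansion in the x-direction with factor 2, as described in the solution, and the names `matvec`, `corners`, and `image_of_square` are hypothetical.

```python
def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Corners of the unit square, listed counterclockwise from the origin.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]

def image_of_square(A):
    """Images of the four corners of the unit square under multiplication by A."""
    return [tuple(matvec(A, corner)) for corner in corners]

A1 = [[1, 2], [0, 1]]   # shear in the x-direction with factor 2
A2 = [[2, 0], [0, 2]]   # dilation with factor 2
A3 = [[2, 0], [0, 1]]   # expansion in the x-direction with factor 2

for name, A in (("A1", A1), ("A2", A2), ("A3", A3)):
    print(name, image_of_square(A))
```

Because a matrix operator maps line segments to line segments, the images of the four corners determine the image of the whole square: a parallelogram for the shear, a larger square for the dilation, and a rectangle for the expansion.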


Figure 4.9.6 


OPTIONAL 
Orthogonal Projections on Lines Through the Origin 


In Table 3 we listed the standard matrices for the orthogonal projections on the coordinate axes in $R^2$. These are special
cases of the more general operator $T: R^2 \to R^2$ that maps each point into its orthogonal projection on a line $L$ through
the origin that makes an angle $\theta$ with the positive x-axis (Figure 4.9.7). In Example 4 of Section 3.3 we used Formula 10
of that section to find the orthogonal projections of the standard basis vectors for $R^2$ on that line. Expressed in matrix
form, we found those projections to be

$$T(\mathbf{e}_1) = \begin{bmatrix} \cos^2\theta \\ \sin\theta\cos\theta \end{bmatrix} \quad \text{and} \quad T(\mathbf{e}_2) = \begin{bmatrix} \sin\theta\cos\theta \\ \sin^2\theta \end{bmatrix}$$

Figure 4.9.7 


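Thus, the standard matrix for $T$ is

$$[T] = [\,T(\mathbf{e}_1) \mid T(\mathbf{e}_2)\,] = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta & \frac{1}{2}\sin 2\theta \\ \frac{1}{2}\sin 2\theta & \sin^2\theta \end{bmatrix}$$

In keeping with common usage, we will denote this operator by

$$P_\theta = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta & \frac{1}{2}\sin 2\theta \\ \frac{1}{2}\sin 2\theta & \sin^2\theta \end{bmatrix} \tag{16}$$


We have included two versions of Formula 16 because both are commonly used. Whereas the first
version involves only the angle $\theta$, the second involves both $\theta$ and $2\theta$.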


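EXAMPLE 6 Orthogonal Projection on a Line Through the Origin


Use Formula 16 to find the orthogonal projection of the vector $\mathbf{x} = (1, 5)$ on the line through the origin
that makes an angle of $\pi/6$ ($= 30^\circ$) with the x-axis.


Solution Since $\sin(\pi/6) = 1/2$ and $\cos(\pi/6) = \sqrt{3}/2$, it follows from 16 that the standard matrix
for this projection is

$$P_{\pi/6} = \begin{bmatrix} \cos^2(\pi/6) & \sin(\pi/6)\cos(\pi/6) \\ \sin(\pi/6)\cos(\pi/6) & \sin^2(\pi/6) \end{bmatrix} = \begin{bmatrix} \frac{3}{4} & \frac{\sqrt{3}}{4} \\ \frac{\sqrt{3}}{4} & \frac{1}{4} \end{bmatrix}$$

Thus,

$$P_{\pi/6}\mathbf{x} = \begin{bmatrix} \frac{3}{4} & \frac{\sqrt{3}}{4} \\ \frac{\sqrt{3}}{4} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \begin{bmatrix} \frac{3 + 5\sqrt{3}}{4} \\ \frac{\sqrt{3} + 5}{4} \end{bmatrix} \approx \begin{bmatrix} 2.92 \\ 1.68 \end{bmatrix}$$

or in comma-delimited notation, $P_{\pi/6}(1, 5) \approx (2.92, 1.68)$.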
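The projection in Example 6 can be evaluated numerically. This is a minimal sketch, not a definitive implementation; the function names `projection_matrix`, `projection_matrix_double_angle`, and `matvec` are assumptions introduced for illustration. It also compares the two versions of Formula 16.

```python
import math

def projection_matrix(theta):
    """First version of Formula 16: entries in terms of the angle theta only."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, s * c],
            [s * c, s * s]]

def projection_matrix_double_angle(theta):
    """Second version of Formula 16: off-diagonal entries written as (1/2) sin 2*theta."""
    h = 0.5 * math.sin(2 * theta)
    return [[math.cos(theta) ** 2, h],
            [h, math.sin(theta) ** 2]]

def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

theta = math.pi / 6
P = projection_matrix(theta)
w = matvec(P, [1, 5])
print(w)  # exact values are (3 + 5*sqrt(3))/4 and (sqrt(3) + 5)/4
```

Applying `P` to `w` again returns `w` up to rounding error, which reflects the fact that a point already on the line is its own projection.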


Reflections About Lines Through the Origin 


In Table 1 we listed the reflections about the coordinate axes in $R^2$. These are special cases of the more general operator
$H_\theta: R^2 \to R^2$ that maps each point into its reflection about a line $L$ through the origin that makes an angle $\theta$ with the
positive x-axis (Figure 4.9.8). We could find the standard matrix for $H_\theta$ by finding the images of the standard basis
vectors, but instead we will take advantage of our work on orthogonal projections by using Formula 16 for $P_\theta$ to
find a formula for $H_\theta$.


Figure 4.9.8 


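You should be able to see from Figure 4.9.9 that for every vector $\mathbf{x}$ in $R^2$

$$P_\theta\mathbf{x} - \mathbf{x} = \tfrac{1}{2}(H_\theta\mathbf{x} - \mathbf{x}) \quad \text{or equivalently} \quad H_\theta\mathbf{x} = (2P_\theta - I)\mathbf{x}$$


Figure 4.9.9


Thus, it follows from Theorem 4.9.2 that

$$H_\theta = 2P_\theta - I \tag{17}$$

and hence from 16 that

$$H_\theta = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix} \tag{18}$$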
EXAMPLE 7 Reflection About a Line Through the Origin + 
Find the reflection of the vector х = (1, 5) on the line through the origin that makes an angle of n/6(= 30?) 
with the x-axis. 
Solution Since sin (= і 3) = үз į 2 and cos(z / 3) = 1/2, it follows from 18 that the standard matrix 
for this projection is 
| i у 
H cos(r/ 3)  sm(m/3) 2 2 
V6 | sn(r/3) —соз(л/3)| Га 
2 
Thus, 
a 3 1+ 5үз 
Moins 2 2 B 2 | 4.83 
ma B. 3-5 | | -163 
2 2 
or in comma-delimited notation, 2.1601, 5) = (4.83, = 1.63) 
Show that the standard matrices in Tables 1 and 3 
are special cases of 18 and 16. 
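
The projection and reflection formulas can be checked numerically. The following is a minimal sketch in plain Python; the helper names `projection_matrix`, `reflection_matrix`, and `mat_vec` are illustrative choices, not notation from the text. It applies Formulas 16 and 18 to x = (1, 5) with θ = π/6 and also confirms the relation H = 2P − I of Formula 17 for this vector.

```python
import math


def projection_matrix(theta):
    """P_theta from Formula 16: orthogonal projection onto the line at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, s * c],
            [s * c, s * s]]


def reflection_matrix(theta):
    """H_theta from Formula 18: reflection about the line at angle theta."""
    return [[math.cos(2 * theta), math.sin(2 * theta)],
            [math.sin(2 * theta), -math.cos(2 * theta)]]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


theta = math.pi / 6
x = [1.0, 5.0]

projected = mat_vec(projection_matrix(theta), x)
reflected = mat_vec(reflection_matrix(theta), x)

# Formula 17 says H = 2P - I, so the reflection should equal 2*(projection) - x.
via_formula_17 = [2 * p - xi for p, xi in zip(projected, x)]

print("projection:", projected)
print("reflection:", reflected)
print("2Px - x:   ", via_formula_17)
```

The printed projection and reflection can be compared with the exact values ((3 + 5√3)/4, (√3 + 5)/4) and ((1 + 5√3)/2, (√3 − 5)/2) from Examples 6 and 7.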


Concept Review 
* Function 


* Image 


* Value 

* Domain 

* Codomain 

* Transformation 

* Relationships among the fundamental spaces 
* Operator 

* Matrix transformation 

* Matrix operator 

* Standard matrix 

* Properties of matrix transformations 
* Zero transformation 

* Identity operator

* Reflection operator 

* Projection operator 

* Rotation operator 

* Rotation matrix 

* Rotation equations 

* Axis of rotation in 3-space 
* Angle of rotation in 3-space 
* Expansion operator 

* Compression operator 

* Shear 

* Dilation 


* Contraction 


Skills 
* Find the domain and codomain of a transformation, and determine whether the transformation is linear.


* Find the standard matrix for a matrix transformation. 


* Describe the effect of a matrix operator on the standard basis in R^n.


Exercise Set 4.9 
In Exercises 1-2, find the domain and codomain of the transformation T_A(x) = Ax.


1. (a) A has size 3 × 2.
(b) A has size 2 × 3.
(c) A has size 3 × 3.
(d) A has size 1 × 6.


Answer:


(a) Domain: R^2; codomain: R^3
(b) Domain: R^3; codomain: R^2
(c) Domain: R^3; codomain: R^3
(d) Domain: R^6; codomain: R^1


2. (a) A has size 4 x 5. 
(b) A has size 5 x 4. 
(c) A has size 4 x 4. 
(d) A has size 3 x 1. 


3. If T(x1, x2) = (x1 + x2, −x2, 3x1), then the domain of T is ____, the codomain of T is ____, and
the image of x = (1, −2) under T is ____.


Answer:


R^2, R^3, (−1, 2, 3)

4. If T(x1, x2, x3) = (x1 + 2x3, x1 − 2x3), then the domain of T is ____, the codomain of T is ____,
and the image of x = (0, −1, 4) under T is ____.


5. In each part, find the domain and codomain of the transformation defined by the equations, and determine whether
the transformation is linear.


(a) w1 = 3x1 − 2x2 + 4x3
    w2 = 5x1 − 8x2 + x3

(b) w1 = 2x1x2 − x2
    w2 = x1 + 3x1x2
    w3 = x1 + x2

(c) w1 = 5x1 − x2 + x3
    w2 = −x1 + x2 + 7x3
    w3 = 2x1 − 4x2 − x3

(d) w1 = x1^2 − 3x2 + x3 − 2x4
    w2 = 3x1 − 4x2 − x3^2 + x4

Answer:


(a) Linear; R^3 → R^2
(b) Nonlinear; R^2 → R^3
(c) Linear; R^3 → R^3
(d) Nonlinear; R^4 → R^2


6. In each part, determine whether T is a matrix transformation.
(a) T(x, y) = (2x, y)
(b) T(x, y) = (−y, x)
(c) T(x, y) = (2x + y, x − y)
(d) T(x} = (к>) 


(е) T(x, y) = (х,у +1) 
7. In each part, determine whether T is a matrix transformation.
(a) T(x, y, z) = (0, 0)
(b) T(x, y, z) = (1, 1)
(c) T(x, y, z) = (3x − 4y, 2x − 5z)


(d) (х, y.) = z) 
(e) T(x, у,2) = y — 1, х) 
Answer: 


(a) and (c) are matrix transformations; (b), (d), and (e) are not matrix transformations. 


8. Find the standard matrix for the transformation defined by the equations. 


(а) wi = 2x, = 3x2 + x4 
w2 = 3x, + 5x2 = х4 
b) wi = Tx, 2х2 = 8x3 
ма = =x + 5х3 
w3 = 4х] 7х) = х3 
(с) #1 = =x + x2 
wa = 3x, = 2x3 
w3 = 5x, = 7x3 
(4) #1 5 ^1 
#2 = Хх] х2 
w3 = X|j-x2-x3 


WA ХІ х2 9х3 х4 


9. Find the standard matrix for the operator T: R^3 → R^3 defined by


w1 = 3x1 + 5x2 − x3
w2 = 4x1 − x2 + x3
w3 = 3x1 + 2x2 − x3


and then calculate T(−1, 2, 4) by directly substituting in the equations and also by matrix multiplication.


Answer:

\[
\begin{bmatrix}
3 & 5 & -1 \\
4 & -1 & 1 \\
3 & 2 & -1
\end{bmatrix};
\quad T(-1, 2, 4) = (3, -2, -3)
\]


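10. Find the standard matrix for the operator T defined by the formula.
(a) T(x1, x2) = (2x1 − x2, x1 + x2)
(b) T(x1, x2) = (x1, x2)
(c) T(x1, x2, x3) = (x1 + 2x2 + x3, x1 + 5x2, x3)
(d) T(x1, x2, x3) = (4x1, 7x2, −8x3)

11. Find the standard matrix for the transformation T defined by the formula.
(a) T(x1, x2) = (x2, −x1, x1 + 3x2, x1 − x2)
(b) T(x1, x2, x3, x4) = (7x1 + 2x2 − x3 + x4, x2 + x3, −x1)
(c) T(x1, x2, x3) = (0, 0, 0, 0, 0)
(d) T(x1, x2, x3, x4) = (x4, x1, x3, x2, x1 − x3)


Answer:

\[
\text{(a)}\;
\begin{bmatrix} 0 & 1 \\ -1 & 0 \\ 1 & 3 \\ 1 & -1 \end{bmatrix}
\qquad
\text{(b)}\;
\begin{bmatrix} 7 & 2 & -1 & 1 \\ 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 0 \end{bmatrix}
\]

\[
\text{(c)}\;
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\text{(d)}\;
\begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 \end{bmatrix}
\]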


12. In each part, find T(x), and express the answer in matrix form. 


а р] 
(b) -1 
r|=|-; 7 x= 1 
3 
(с) —2 1 4 X] 
Т|=| 35 7|x-|x2 
6 0 =l X3 
(d) =| J 
Т|=| 2 4); х= [ж] 
7 8 


13. In each part, use the standard matrix for T to find T(x); then check the result by calculating T(x) directly.
(a) T(x1, x2) = (−x1 + x2, x2); x = (−1, 4)
(b) T(x1, x2, x3) = (2x1 − x2 + x3, x2 + x3, 0); x = (2, 1, −3)


Answer:


(a) T(−1, 4) = (5, 4)
(b) T(2, 1, −3) = (0, −2, 0)
14. Use matrix multiplication to find the reflection of (−1, 2) about

(a) the x-axis.

(b) the y-axis.

(c) the line y = x.


15. Use matrix multiplication to find the reflection of (2, −5, 3) about

(a) the xy-plane.

(b) the xz-plane.

(c) the yz-plane.


Answer:


(a) (2, −5, −3)
(b) (2, 5, 3)
(c) (−2, −5, 3)


16. Use matrix multiplication to find the orthogonal projection of (2, −5) on

(a) the x-axis.

(b) the y-axis.


17. Use matrix multiplication to find the orthogonal projection of (−2, 1, 3) on

(a) the xy-plane.

(b) the xz-plane.

(c) the yz-plane.


Answer:

(a) (−2, 1, 0)
(b) (−2, 0, 3)
(c) (0, 1, 3)


18. Use matrix multiplication to find the image of the vector (5, −4) when it is rotated through an angle of

(a) θ = 30°.

(b) θ = −60°.

(c) θ = 45°.

(d) θ = 90°.


19. Use matrix multiplication to find the image of the vector (−2, 1, 2) if it is rotated

(a) 30° about the x-axis.
(b) 45° about the y-axis.
(c) 90° about the z-axis.


Answer:

(a) (−2, (√3 − 2)/2, (1 + 2√3)/2)

(b) (0, 1, 2√2)

(c) (−1, −2, 2)


20. Find the standard matrix for the operator that rotates a vector in R^3 through an angle of −60° about

(a) the x-axis.

(b) the y-axis.

(c) the z-axis.


21. Use matrix multiplication to find the image of the vector (−2, 1, 2) if it is rotated

(a) −30° about the x-axis.

(b) −45° about the y-axis.

(c) −90° about the z-axis.


Answer:


(a) (−2, (√3 + 2)/2, (−1 + 2√3)/2)

(b) (−2√2, 1, 0)

(c) (1, 2, 2)


22. In R^3 the orthogonal projections on the x-axis, y-axis, and z-axis are defined by
T1(x, y, z) = (x, 0, 0), T2(x, y, z) = (0, y, 0), T3(x, y, z) = (0, 0, z)
respectively.


(a) Show that the orthogonal projections on the coordinate axes are matrix operators, and find their standard
matrices.


(b) Show that if T: R^3 → R^3 is an orthogonal projection on one of the coordinate axes, then for every vector x in R^3,
the vectors T(x) and x − T(x) are orthogonal.


(c) Make a sketch showing x and x − T(x) in the case where T is the orthogonal projection on the x-axis.


23. Use Formula 15 to derive the standard matrices for the rotations about the x-axis, y-axis, and z-axis in R^3.


24. Use Formula 15 to find the standard matrix for a rotation of π/2 radians about the axis determined by the vector
v = (1, 1, 1). [Note: Formula 15 requires that the vector defining the axis of rotation have length 1.]


25. Use Formula 15 to find the standard matrix for a rotation of 180° about the axis determined by the vector
v = (2, 2, 1). [Note: Formula 15 requires that the vector defining the axis of rotation have length 1.]


Answer:

\[
\begin{bmatrix}
-\frac{1}{9} & \frac{8}{9} & \frac{4}{9} \\
\frac{8}{9} & -\frac{1}{9} & \frac{4}{9} \\
\frac{4}{9} & \frac{4}{9} & -\frac{7}{9}
\end{bmatrix}
\]

26. It can be proved that if A is a 2 × 2 matrix with orthonormal column vectors and for which det(A) = 1, then
multiplication by A is a rotation through some angle θ. Verify that


satisfies the stated conditions and find the angle of rotation. 


27. The result stated in Exercise 26 can be extended to R^3; that is, it can be proved that if A is a 3 × 3 matrix with
orthonormal column vectors and for which det(A) = 1, then multiplication by A is a rotation about some axis
through some angle θ. Use Formula 15 to show that the angle of rotation satisfies the equation

\[
\cos\theta = \frac{\operatorname{tr}(A) - 1}{2}
\]

28. Let A be a 3 × 3 matrix (other than the identity matrix) satisfying the conditions stated in Exercise 27. It can be
shown that if x is any nonzero vector in R^3, then the vector u = Ax + A^T x + [1 − tr(A)]x determines an axis of
rotation when u is positioned with its initial point at the origin. [See "The Axis of Rotation: Analysis, Algebra,
Geometry," by Dan Kalman, Mathematics Magazine, Vol. 62, No. 4, October 1989.]


(a) Show that multiplication by 


l 
O| jo чое 
wj wl ole 


wo «o sojo 


is a rotation. 
(b) Find a vector of length 1 that defines an axis for the rotation. 
(c) Use the result in Exercise 27 to find the angle of rotation about the axis obtained in part (b). 


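29. In words, describe the geometric effect of multiplying a vector x by the matrix A.

\[
\text{(a)}\; A = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}
\qquad
\text{(b)}\; A = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}
\]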

Answer: 


(a) Twice the orthogonal projection on the x-axis. 


(b) Twice the reflection about the x-axis. 


30. In words, describe the geometric effect of multiplying a vector x by the matrix А. 


(a) , |2 0 
4-[5 3] 
€ [fs 4 

"NES 
А= 

i үз 

5-5 


31. In words, describe the geometric effect of multiplying a vector x by the matrix

\[
A =
\begin{bmatrix}
\cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta \\
2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta
\end{bmatrix}
\]


Answer:


Rotation through the angle 2θ.

32. If multiplication by A rotates a vector x in the xy-plane through an angle θ, what is the effect of multiplying x by A^T?
Explain your reasoning.


33. Let x0 be a nonzero column vector in R^2, and suppose that T: R^2 → R^2 is the transformation defined by the formula
T(x) = x0 + R_θ x, where R_θ is the standard matrix of the rotation of R^2 about the origin through the angle θ. Give a
geometric description of this transformation. Is it a matrix transformation? Explain.

Answer:


Rotation through the angle θ and translation by x0; not a matrix transformation since x0 is nonzero.


34. A function of the form f(x) = mx + b is commonly called a "linear function" because the graph of y = mx + b is
a line. Is f a matrix transformation on R?


35. Let x = x0 + tv be a line in R^n, and let T: R^n → R^n be a matrix operator on R^n. What kind of geometric object is
the image of this line under the operator T? Explain your reasoning.


Answer:

A line in R^n.
True-False Exercises
In parts (a)-(i) determine whether the statement is true or false, and justify your answer.

(a) If A is a 2 × 3 matrix, then the domain of the transformation T_A is R^2.

Answer:

False


(b) If A is an m × n matrix, then the codomain of the transformation T_A is R^n.

Answer:

False


(c) If T: R^n → R^m and T(0) = 0, then T is a matrix transformation.

Answer:

False


(d) If T: R^n → R^m and T(c1 x + c2 y) = c1 T(x) + c2 T(y) for all scalars c1 and c2 and all vectors x and y in R^n, then
T is a matrix transformation.

Answer:

True


(e) There is only one matrix transformation T: R^n → R^m such that T(−x) = −T(x) for every vector x in R^n.

Answer:

False


(f) There is only one matrix transformation T: R^n → R^m such that T(x + y) = T(x − y) for all vectors x and y in R^n.

Answer:

True


(g) If b is a nonzero vector in R^n, then T(x) = x + b is a matrix operator on R^n.

Answer:

False


(h) 


The matrix is the standard matrix for a rotation. 


Mle pole 
role pole 


Answer: 


False 


(i) The standard matrices of the reflections about the coordinate axes in 2-space have the form
\[
\begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}
\]
where a = ±1.
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


4.10 Properties of Matrix Transformations 


In this section we will discuss properties of matrix transformations. We will show, for example, that if several 
matrix transformations are performed in succession, then the same result can be obtained by a single matrix 
transformation that is chosen appropriately. We will also explore the relationship between the invertibility of a 
matrix and properties of the corresponding transformation. 


Compositions of Matrix Transformations 


Suppose that T_A is a matrix transformation from R^n to R^k and T_B is a matrix transformation from R^k to R^m. If x
is a vector in R^n, then T_A maps this vector into a vector T_A(x) in R^k, and T_B, in turn, maps that vector into the
vector T_B(T_A(x)) in R^m. This process creates a transformation from R^n to R^m that we call the composition of
T_B with T_A and denote by the symbol

T_B ∘ T_A

which is read "T_B circle T_A". As illustrated in Figure 4.10.1, the transformation T_A in the formula is performed
first; that is,


(T_B ∘ T_A)(x) = T_B(T_A(x)) (1)


This composition is itself a matrix transformation since

(T_B ∘ T_A)(x) = T_B(T_A(x)) = B(T_A(x)) = B(Ax) = (BA)x

which shows that it is multiplication by BA. This is expressed by the formula


T_B ∘ T_A = T_BA (2)
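
Formula 2 can be illustrated with a small numerical check. The sketch below uses plain Python lists; the matrices `A` and `B`, the vector `x`, and the helper names `mat_vec` and `mat_mul` are hypothetical choices made only for illustration. It applies T_A and then T_B step by step, and compares the result with a single multiplication by BA.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def mat_mul(left, right):
    """Product of two matrices stored as lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]


A = [[1, 2], [0, 1], [3, -1]]    # 3 x 2, so T_A maps R^2 into R^3
B = [[2, 0, 1], [1, -1, 4]]      # 2 x 3, so T_B maps R^3 into R^2
x = [5, -2]

two_steps = mat_vec(B, mat_vec(A, x))    # T_B(T_A(x))
one_step = mat_vec(mat_mul(B, A), x)     # (BA)x

print(two_steps, one_step)
```

Both computations give the same vector in R^2, as Formula 2 predicts. Note that the product must be formed as BA, with the matrix of the transformation performed first on the right.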


WARNING 


Just as it is not true, in general, that
AB = BA
so it is not true, in general, that
T_B ∘ T_A = T_A ∘ T_B
That is, order matters when matrix
transformations are composed.


Figure 4.10.1


Compositions can be defined for any finite succession of matrix transformations whose domains and ranges have
the appropriate dimensions. For example, to extend Formula 2 to three factors, consider the matrix
transformations

T_A: R^n → R^k, T_B: R^k → R^l, T_C: R^l → R^m

We define the composition (T_C ∘ T_B ∘ T_A): R^n → R^m by

(T_C ∘ T_B ∘ T_A)(x) = T_C(T_B(T_A(x)))


As above, it can be shown that this is a matrix transformation whose standard matrix is CBA and that


T_C ∘ T_B ∘ T_A = T_CBA (3)


As in Formula 9 of Section 4.9, we can use square brackets to denote a matrix transformation without
referencing a specific matrix. Thus, for example, the formula


[T2 ∘ T1] = [T2][T1] (4)


is a restatement of Formula 2 which states that the standard matrix for a composition is the product of the
standard matrices in the appropriate order. Similarly,


[T3 ∘ T2 ∘ T1] = [T3][T2][T1] (5)

is a restatement of Formula 3.


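EXAMPLE 1 Composition of Two Rotations


Let T1: R^2 → R^2 and T2: R^2 → R^2 be the matrix operators that rotate vectors through the angles θ1
and θ2, respectively. Thus the operation

(T2 ∘ T1)(x) = T2(T1(x))

first rotates x through the angle θ1, then rotates T1(x) through the angle θ2. It follows that the net
effect of T2 ∘ T1 is to rotate each vector in R^2 through the angle θ1 + θ2 (Figure 4.10.2). Thus, the
standard matrices for these matrix operators are

\[
[T_1] = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},
\quad
[T_2] = \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix},
\quad
[T_2 \circ T_1] = \begin{bmatrix} \cos(\theta_1 + \theta_2) & -\sin(\theta_1 + \theta_2) \\ \sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{bmatrix}
\]

These matrices should satisfy 4. With the help of some basic trigonometric identities, we can
confirm that this is so as follows:

\[
\begin{aligned}
[T_2][T_1]
&= \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix}
   \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \\
&= \begin{bmatrix}
   \cos\theta_2\cos\theta_1 - \sin\theta_2\sin\theta_1 & -(\cos\theta_2\sin\theta_1 + \sin\theta_2\cos\theta_1) \\
   \sin\theta_2\cos\theta_1 + \cos\theta_2\sin\theta_1 & -\sin\theta_2\sin\theta_1 + \cos\theta_2\cos\theta_1
   \end{bmatrix} \\
&= \begin{bmatrix} \cos(\theta_1 + \theta_2) & -\sin(\theta_1 + \theta_2) \\ \sin(\theta_1 + \theta_2) & \cos(\theta_1 + \theta_2) \end{bmatrix}
 = [T_2 \circ T_1]
\end{aligned}
\]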


Figure 4.10.2 
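
The identity in Example 1 can also be checked numerically for particular angles. In this minimal sketch the angles 0.7 and 1.9 radians are arbitrary test values, and `rotation` and `mat_mul` are illustrative helper names; the code compares the product of two rotation matrices with the rotation matrix for the sum of the angles.

```python
import math


def rotation(theta):
    """Standard matrix for the rotation of R^2 through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def mat_mul(left, right):
    """Product of two matrices stored as lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]


theta1, theta2 = 0.7, 1.9                                # arbitrary test angles
product = mat_mul(rotation(theta2), rotation(theta1))   # [T2][T1]
direct = rotation(theta1 + theta2)                       # rotation through theta1 + theta2

# Largest entrywise difference; it should be at the level of rounding error.
largest_gap = max(abs(product[i][j] - direct[i][j])
                  for i in range(2) for j in range(2))
print(largest_gap)
```

A check like this does not replace the trigonometric argument, but it is a quick way to catch a sign error in a hand-derived product.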


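EXAMPLE 2 Composition Is Not Commutative


Let T1: R^2 → R^2 be the reflection about the line y = x, and let T2: R^2 → R^2 be the orthogonal
projection on the y-axis. Figure 4.10.3 illustrates graphically that T1 ∘ T2 and T2 ∘ T1 have
different effects on a vector x. This same conclusion can be reached by showing that the standard
matrices for T1 and T2 do not commute:

\[
[T_1 \circ T_2] = [T_1][T_2]
= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
  \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\]

\[
[T_2 \circ T_1] = [T_2][T_1]
= \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
  \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
\]

so [T2 ∘ T1] ≠ [T1 ∘ T2].


Figure 4.10.3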
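
The two products in Example 2 are easy to reproduce. The sketch below assumes matrices stored as lists of rows; the variable names are illustrative. It forms both compositions from the standard matrices of the reflection about y = x and the projection on the y-axis.

```python
def mat_mul(left, right):
    """Product of two matrices stored as lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]


reflect_y_eq_x = [[0, 1], [1, 0]]    # [T1]: reflection about the line y = x
project_y_axis = [[0, 0], [0, 1]]    # [T2]: orthogonal projection on the y-axis

t1_after_t2 = mat_mul(reflect_y_eq_x, project_y_axis)   # [T1 o T2] = [T1][T2]
t2_after_t1 = mat_mul(project_y_axis, reflect_y_eq_x)   # [T2 o T1] = [T2][T1]

print(t1_after_t2)                  # [[0, 1], [0, 0]]
print(t2_after_t1)                  # [[0, 0], [1, 0]]
print(t1_after_t2 == t2_after_t1)   # False: the operators do not commute
```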


EXAMPLE 3 Composition of Two Reflections


Let T1: R^2 → R^2 be the reflection about the y-axis, and let T2: R^2 → R^2 be the reflection about the
x-axis. In this case T1 ∘ T2 and T2 ∘ T1 are the same; both map every vector x = (x, y) into its
negative −x = (−x, −y) (Figure 4.10.4):

(T1 ∘ T2)(x, y) = T1(x, −y) = (−x, −y)

(T2 ∘ T1)(x, y) = T2(−x, y) = (−x, −y)


The equality of T1 ∘ T2 and T2 ∘ T1 can also be deduced by showing that the standard matrices for
T1 and T2 commute:

\[
[T_1 \circ T_2] = [T_1][T_2]
= \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
= \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\]

\[
[T_2 \circ T_1] = [T_2][T_1]
= \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
  \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\]

The operator T(x) = −x on R^2 or R^3 is called the reflection about the origin. As the foregoing
computations show, the standard matrix for this operator on R^2 is

\[
[T] = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\]


Figure 4.10.4


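EXAMPLE 4 Composition of Three Transformations


Find the standard matrix for the operator T: R^3 → R^3 that first rotates a vector counterclockwise
about the z-axis through an angle θ, then reflects the resulting vector about the yz-plane, and then
projects that vector orthogonally onto the xy-plane.


Solution The operator T can be expressed as the composition

T = T3 ∘ T2 ∘ T1

where T1 is the rotation about the z-axis, T2 is the reflection about the yz-plane, and T3 is the
orthogonal projection on the xy-plane. From Tables 6, 2, and 4 of Section 4.9, the standard
matrices for these operators are

\[
[T_1] = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix},
\quad
[T_2] = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\quad
[T_3] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

Thus, it follows from 5 that the standard matrix for T is

\[
[T] = [T_3][T_2][T_1]
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
  \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
  \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} -\cos\theta & \sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]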
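
As a sanity check on Example 4, the three-factor product can be evaluated for a specific angle. This is a sketch only: the angle of 40° is an arbitrary test value and the variable names are illustrative. The factors are multiplied in the order [T3][T2][T1], so that the rotation is applied first.

```python
import math


def mat_mul(left, right):
    """Product of two matrices stored as lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]


theta = math.radians(40)             # arbitrary test angle
c, s = math.cos(theta), math.sin(theta)

rotate_z = [[c, -s, 0], [s, c, 0], [0, 0, 1]]        # [T1]
reflect_yz = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]      # [T2]
project_xy = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]       # [T3]

# Rightmost factor acts first: rotate, then reflect, then project.
composite = mat_mul(project_xy, mat_mul(reflect_yz, rotate_z))

# Closed form obtained in Example 4.
closed_form = [[-c, s, 0], [s, c, 0], [0, 0, 0]]

matches = all(abs(composite[i][j] - closed_form[i][j]) < 1e-12
              for i in range(3) for j in range(3))
print(matches)
```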


One-to-One Matrix Transformations 


Our next objective is to establish a link between the invertibility of a matrix A and properties of the 
corresponding matrix transformation T 4. 


DEFINITION 1 


A matrix transformation T_A: R^n → R^m is said to be one-to-one if T_A maps distinct vectors (points) in R^n
into distinct vectors (points) in R^m.


(See Figure 4.10.5). This idea can be expressed in various ways. For example, you should be able to see that the
following are just restatements of Definition 1:


1. T_A is one-to-one if for each vector b in the range of T_A there is exactly one vector x in R^n such that T_A(x) = b.


2. T_A is one-to-one if the equality T_A(u) = T_A(v) implies that u = v.


Figure 4.10.5 One-to-one (left); not one-to-one (right)


Rotation operators on R^2 are one-to-one since distinct vectors that are rotated through the same angle have
distinct images (Figure 4.10.6). In contrast, the orthogonal projection of R^3 on the xy-plane is not one-to-one
because it maps distinct points on the same vertical line into the same point (Figure 4.10.7).


Figure 4.10.7 The distinct points P and Q are mapped into the same point M


The following theorem establishes a fundamental relationship between the invertibility of a matrix and properties 
of the corresponding matrix transformation. 


THEOREM 4.10.1 


If A is an n × n matrix and T_A: R^n → R^n is the corresponding matrix operator, then the following
statements are equivalent.


(a) A is invertible.
(b) The range of T_A is R^n.


(c) T_A is one-to-one.


Proof We will establish the chain of implications (a) ⇒ (b) ⇒ (c) ⇒ (a).


(a) ⇒ (b) Assume that A is invertible. By parts (a) and (e) of Theorem 4.8.10, the system Ax = b is consistent
for every n × 1 matrix b in R^n. This implies that T_A maps some x into the arbitrary vector b in R^n, which in turn
implies that the range of T_A is all of R^n.


(b) ⇒ (c) Assume that the range of T_A is R^n. This implies that for every vector b in R^n there is some vector x
in R^n for which T_A(x) = b and hence that the linear system Ax = b is consistent for every vector b in R^n. But
the equivalence of parts (e) and (f) of Theorem 4.8.10 implies that Ax = b has a unique solution for every vector
b in R^n and hence that for every vector b in the range of T_A there is exactly one vector x in R^n such that
T_A(x) = b.


(c) ⇒ (a) Assume that T_A is one-to-one. Thus, if b is a vector in the range of T_A there is a unique vector x in
R^n for which T_A(x) = b. We leave it for you to complete the proof using Exercise 30.


EXAMPLE 5 Properties of a Rotation Operator


As indicated in Figure 4.10.6, the operator T: R^2 → R^2 that rotates vectors in R^2 through an angle θ
is one-to-one. Confirm that [T] is invertible in accordance with Theorem 4.10.1.


Solution From Table 5 of Section 4.9 the standard matrix for T is

\[
[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\]

This matrix is invertible because

\[
\det[T] = \begin{vmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{vmatrix}
= \cos^2\theta + \sin^2\theta = 1 \neq 0
\]


EXAMPLE 6 Properties of a Projection Operator


As indicated in Figure 4.10.7, the operator T: R^3 → R^3 that projects each vector in R^3
orthogonally on the xy-plane is not one-to-one. Confirm that [T] is not invertible in accordance
with Theorem 4.10.1.


Solution From Table 4 of Section 4.9 the standard matrix for T is

\[
[T] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

This matrix is not invertible since det[T] = 0.
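
The contrast between Examples 5 and 6 can be seen in a few lines of code. The sketch below is illustrative only: the angle 1.2 and the points `p` and `q` are arbitrary test values, and `det2`, `det3`, and `mat_vec` are hypothetical helper names. It computes both determinants and exhibits two distinct points with the same image under the projection.

```python
import math


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


theta = 1.2                                   # arbitrary test angle
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
projection = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

det_rotation = det2(rotation)        # cos^2 + sin^2, so 1 up to rounding
det_projection = det3(projection)    # 0, so the projection is not invertible

# Two distinct points on the same vertical line have the same image.
p, q = [2, 3, 5], [2, 3, -1]
same_image = mat_vec(projection, p) == mat_vec(projection, q)

print(det_rotation, det_projection, same_image)
```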


Inverse of a One-to-One Matrix Operator 


If T_A: R^n → R^n is a one-to-one matrix operator, then it follows from Theorem 4.10.1 that A is invertible. The
matrix operator

T_{A^{-1}}: R^n → R^n

that corresponds to A^{-1} is called the inverse operator or (more simply) the inverse of T_A. This terminology is
appropriate because T_A and T_{A^{-1}} cancel the effect of each other in the sense that if x is any vector in R^n, then

T_A(T_{A^{-1}}(x)) = AA^{-1}x = Ix = x
T_{A^{-1}}(T_A(x)) = A^{-1}Ax = Ix = x

or, equivalently,

T_A ∘ T_{A^{-1}} = T_{AA^{-1}} = T_I
T_{A^{-1}} ∘ T_A = T_{A^{-1}A} = T_I


From a more geometric viewpoint, if w is the image of x under T_A, then T_{A^{-1}} maps w back into x, since

T_{A^{-1}}(w) = T_{A^{-1}}(T_A(x)) = x

(Figure 4.10.8).


Figure 4.10.8 


Before considering examples, it will be helpful to touch on some notational matters. If T_A: R^n → R^n is a
one-to-one matrix operator, and if T_A^{-1}: R^n → R^n is its inverse, then the standard matrices for these operators
are related by the equation

T_A^{-1} = T_{A^{-1}} (6)


In cases where it is preferable not to assign a name to the matrix, we will write this equation as


[T^{-1}] = [T]^{-1} (7)


EXAMPLE 7 Standard Matrix for T^{-1}


Let T: R^2 → R^2 be the operator that rotates each vector in R^2 through the angle θ, so from Table 5
of Section 4.9,

\[
[T] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{8}
\]

It is evident geometrically that to undo the effect of T, one must rotate each vector in R^2 through
the angle −θ. But this is exactly what the operator T^{-1} does, since the standard matrix for T^{-1} is

\[
[T^{-1}] = [T]^{-1}
= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
= \begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix}
\]

(verify), which is the standard matrix for a rotation through the angle −θ.


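EXAMPLE 8 Finding T^{-1}


Show that the operator T: R^2 → R^2 defined by the equations

w1 = 2x1 + x2
w2 = 3x1 + 4x2

is one-to-one, and find T^{-1}(w1, w2).


Solution The matrix form of these equations is

\[
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}
  \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]

so the standard matrix for T is

\[
[T] = \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}
\]

This matrix is invertible (so T is one-to-one) and the standard matrix for T^{-1} is

\[
[T^{-1}] = [T]^{-1}
= \begin{bmatrix} \frac{4}{5} & -\frac{1}{5} \\ -\frac{3}{5} & \frac{2}{5} \end{bmatrix}
\]

Thus

\[
[T^{-1}] \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
= \begin{bmatrix} \frac{4}{5} & -\frac{1}{5} \\ -\frac{3}{5} & \frac{2}{5} \end{bmatrix}
  \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
= \begin{bmatrix} \frac{4}{5}w_1 - \frac{1}{5}w_2 \\ -\frac{3}{5}w_1 + \frac{2}{5}w_2 \end{bmatrix}
\]

from which we conclude that

T^{-1}(w1, w2) = ((4/5)w1 − (1/5)w2, −(3/5)w1 + (2/5)w2)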
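
The inverse of the operator with equations w1 = 2x1 + x2, w2 = 3x1 + 4x2 can be verified with exact arithmetic. The following sketch uses Python's `fractions.Fraction`; the helper `inverse_2x2` applies the usual adjugate formula for a 2 × 2 matrix, and the test vector `x` is an arbitrary choice made for illustration.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2 x 2 integer matrix via the adjugate formula, in exact fractions."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so the operator is not one-to-one")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


standard_matrix = [[2, 1], [3, 4]]               # [T]
inverse_matrix = inverse_2x2(standard_matrix)    # [T^-1] = [T]^-1

x = [7, -3]                                      # arbitrary test vector
w = mat_vec(standard_matrix, x)                  # w = T(x)
recovered = mat_vec(inverse_matrix, w)           # T^-1(w), which should equal x

print(inverse_matrix)
print(w, recovered)
```

Using fractions instead of floating point keeps the entries 4/5, −1/5, −3/5, 2/5 exact, so the round trip returns x without rounding error.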


Linearity Properties 


Up to now we have focused exclusively on matrix transformations from R^n to R^m. However, these are not the
only kinds of transformations from R^n to R^m. For example, if f1, f2, ..., fm are any functions of the n
variables x1, x2, ..., xn, then the equations

w1 = f1(x1, x2, ..., xn)

w2 = f2(x1, x2, ..., xn)

...

wm = fm(x1, x2, ..., xn)

define a transformation T: R^n → R^m that maps the vector x = (x1, x2, ..., xn) into the vector (w1, w2, ..., wm).
But it is only in the case where these equations are linear that T is a matrix transformation. The question that we
will now consider is this:


Question


Are there algebraic properties of a transformation T: R^n → R^m that can be used to determine whether T is
a matrix transformation?


The answer is provided by the following theorem. 


THEOREM 4.10.2


T: R^n → R^m is a matrix transformation if and only if the following relationships hold for all vectors u
and v in R^n and for every scalar k:


(i) T(u + v) = T(u) + T(v) [Additivity property]
(ii) T(ku) = kT(u) [Homogeneity property]


Proof If T is a matrix transformation, then properties (i) and (ii) follow respectively from parts (c) and (b) of
Theorem 4.9.1.


Conversely, assume that properties (i) and (ii) hold. We must show that there exists an m × n matrix A such that

T(x) = Ax

for every vector x in R^n. As a first step, recall from Formula (10) of Section 4.9 that the additivity and
homogeneity properties imply that


T(k1 u1 + k2 u2 + ... + kr ur) = k1 T(u1) + k2 T(u2) + ... + kr T(ur) (9)


for all scalars k1, k2, ..., kr and all vectors u1, u2, ..., ur in R^n. Let A be the matrix

A = [T(e1) | T(e2) | ... | T(en)]

in which e1, e2, ..., en are the standard basis vectors for R^n.


It follows from Theorem 1.3.1 that Ax is a linear combination of the columns of A in which the successive
coefficients are the entries x1, x2, ..., xn of x. That is,

Ax = x1 T(e1) + x2 T(e2) + ... + xn T(en)

Using 9 we can rewrite this as

Ax = T(x1 e1 + x2 e2 + ... + xn en) = T(x)

which completes the proof.
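
The construction used in the proof, A = [T(e1) | T(e2) | ... | T(en)], translates directly into code. The sketch below is illustrative: `standard_matrix`, `linear_map`, and `shifted_map` are hypothetical names, and the two maps are sample transformations chosen for the demonstration. For the linear map the resulting matrix reproduces T; for the shifted map it does not, which reflects the failure of the linearity conditions.

```python
def standard_matrix(transform, n):
    """Build A = [T(e1) | ... | T(en)] from the images of the standard basis vectors."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(transform(e_j))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def linear_map(v):
    """T(x, y, z) = (3x - 4y, 2x - 5z), a linear transformation from R^3 to R^2."""
    x, y, z = v
    return [3 * x - 4 * y, 2 * x - 5 * z]


def shifted_map(v):
    """T(x, y) = (x, y + 1), which is not linear because T(0) is not 0."""
    x, y = v
    return [x, y + 1]


A = standard_matrix(linear_map, 3)
test_vector = [2, -1, 4]
agrees = mat_vec(A, test_vector) == linear_map(test_vector)

# For the shifted map the same recipe yields a matrix, but it fails to reproduce T.
candidate = standard_matrix(shifted_map, 2)
fails = mat_vec(candidate, [2, 3]) != shifted_map([2, 3])

print(A, agrees, fails)
```

Agreement on one test vector is only a spot check; the proof above is what guarantees Ax = T(x) for every x when T satisfies the additivity and homogeneity properties.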


The additivity and homogeneity properties in Theorem 4.10.2 are called linearity conditions, and a 
transformation that satisfies these conditions is called a linear transformation. Using this terminology Theorem 


4.10.2 can be restated as follows. 


THEOREM 4.10.3 


Every linear transformation from R^n to R^m is a matrix transformation, and conversely, every matrix
transformation from R^n to R^m is a linear transformation.


More on the Equivalence Theorem 


As our final result in this section, we will add parts (b) and (c) of Theorem 4.10.1 to Theorem 4.8.10.


THEOREM 4.10.4 Equivalent Statements 


If A is an n × n matrix, then the following statements are equivalent.

(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row echelon form of A is I_n.

(d) A is expressible as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.

(g) det(A) ≠ 0.

(h) The column vectors of A are linearly independent.

(i) The row vectors of A are linearly independent.

(j) The column vectors of A span R^n.

(k) The row vectors of A span R^n.

(l) The column vectors of A form a basis for R^n.

(m) The row vectors of A form a basis for R^n.

(n) A has rank n.

(o) A has nullity 0.

(p) The orthogonal complement of the null space of A is R^n.

(q) The orthogonal complement of the row space of A is {0}.

(r) The range of T_A is R^n.

(s) T_A is one-to-one.


Concept Review 

* Composition of matrix transformations 

* Reflection about the origin 

* One-to-one transformation 

* Inverse of a matrix operator

* Linearity conditions 

* Linear transformation 

* Equivalent characterizations of invertible matrices 

Skills 

* Find the standard matrix for a composition of matrix transformations. 
* Determine whether a matrix operator is one-to-one; if it is, then find the inverse operator. 


* Determine whether a transformation is a linear transformation. 


Exercise Set 4.10 


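In Exercises 1-2, let T_A and T_B be the operators whose standard matrices are given. Find the standard matrices
for T_B ∘ T_A and T_A ∘ T_B.


1.
\[
A = \begin{bmatrix} 1 & -2 & 0 \\ 4 & 1 & -3 \\ 5 & 2 & 4 \end{bmatrix},
\quad
B = \begin{bmatrix} 2 & -3 & 3 \\ 5 & 0 & 1 \\ 6 & 1 & 7 \end{bmatrix}
\]

Answer:

\[
T_B \circ T_A = \begin{bmatrix} 5 & -1 & 21 \\ 10 & -8 & 4 \\ 45 & 3 & 25 \end{bmatrix},
\quad
T_A \circ T_B = \begin{bmatrix} -8 & -3 & 1 \\ -5 & -15 & -8 \\ 44 & -11 & 45 \end{bmatrix}
\]
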
2 6 3 =l 0 4 
A=|2 0 1), B=] =1 J 2 
4 =3 6 -3 8 


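3. Let T1(x1, x2) = (x1 + x2, x1 − x2) and T2(x1, x2) = (3x1, 2x1 + 4x2).
(a) Find the standard matrices for T1 and T2.
(b) Find the standard matrices for T2 ∘ T1 and T1 ∘ T2.
(c) Use the matrices obtained in part (b) to find formulas for T1(T2(x1, x2)) and T2(T1(x1, x2)).


Answer:

\[
\text{(a)}\; [T_1] = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},
\quad [T_2] = \begin{bmatrix} 3 & 0 \\ 2 & 4 \end{bmatrix}
\]

\[
\text{(b)}\; [T_2 \circ T_1] = \begin{bmatrix} 3 & 3 \\ 6 & -2 \end{bmatrix},
\quad [T_1 \circ T_2] = \begin{bmatrix} 5 & 4 \\ 1 & -4 \end{bmatrix}
\]

(c) T2(T1(x1, x2)) = (3x1 + 3x2, 6x1 − 2x2),
T1(T2(x1, x2)) = (5x1 + 4x2, x1 − 4x2)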


4. Let T1(x1, x2, x3) = (4x1, −2x1 + x2, −x1 − 3x3) and T2(x1, x2, x3) = (x1 + 2x2, −x3, 4x1 − x3).
(a) Find the standard matrices for T1 and T2.
(b) Find the standard matrices for T2 ∘ T1 and T1 ∘ T2.


(c) Use the matrices obtained in part (b) to find formulas for T1(T2(x1, x2, x3)) and T2(T1(x1, x2, x3)).
5. Find the standard matrix for the stated composition in R^2.

(a) A rotation of 90°, followed by a reflection about the line y = x.

(b) An orthogonal projection on the y-axis, followed by a contraction with factor k = 1/2.


(c) A reflection about the x-axis, followed by a dilation with factor k = 3.


Answer:

\[
\text{(a)}\; \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\qquad
\text{(b)}\; \begin{bmatrix} 0 & 0 \\ 0 & \frac{1}{2} \end{bmatrix}
\qquad
\text{(c)}\; \begin{bmatrix} 3 & 0 \\ 0 & -3 \end{bmatrix}
\]


6. Find the standard matrix for the stated composition in R^2.


(a) A rotation of 60°, followed by an orthogonal projection on the x-axis, followed by a reflection about the
line y = x.


(b) A dilation with factor k = 2, followed by a rotation of 45°, followed by a reflection about the y-axis.
(c) A rotation of 15°, followed by a rotation of 105°, followed by a rotation of 60°.

7. Find the standard matrix for the stated composition in R^3.


(a) A reflection about the yz-plane, followed by an orthogonal projection on the xz-plane.
(b) A rotation of 45° about the y-axis, followed by a dilation with factor k = √2.


(c) An orthogonal projection on the xy-plane, followed by a reflection about the yz-plane.


Answer:

\[
\text{(a)}\; \begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\text{(b)}\; \begin{bmatrix} 1 & 0 & 1 \\ 0 & \sqrt{2} & 0 \\ -1 & 0 & 1 \end{bmatrix}
\qquad
\text{(c)}\; \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

8. Find the standard matrix for the stated composition in 23. 
(a) A rotation of 30? about the x-axis, followed by a rotation of 30? about the z-axis, followed by a 


contraction with factor & = i 


(b) A reflection about the xy-plane, followed by a reflection about the xz-plane, followed by an orthogonal 
projection on the yz-plane. 


(c) A rotation of 270? about the x-axis, followed by a rotation of 90? about the y-axis, followed by a rotation 
of 180? about the z-axis. 


9. Determine whether T1 ∘ T2 = T2 ∘ T1.
(a) T1: R^2 → R^2 is the orthogonal projection on the x-axis, and T2: R^2 → R^2 is the orthogonal projection on
the y-axis.


(b) T1: R^2 → R^2 is the rotation through an angle θ1, and T2: R^2 → R^2 is the rotation through an angle θ2.


(c) T1: R^2 → R^2 is the orthogonal projection on the x-axis, and T2: R^2 → R^2 is the rotation through an angle
θ.


Answer:


(a) T1 ∘ T2 = T2 ∘ T1

(b) T1 ∘ T2 = T2 ∘ T1

(c) T1 ∘ T2 ≠ T2 ∘ T1

10. Determine whether T1 ∘ T2 = T2 ∘ T1.

(a) T1: R^3 → R^3 is a dilation by a factor k, and T2: R^3 → R^3 is the rotation about the z-axis through an angle
θ.

(b) T1: R^3 → R^3 is the rotation about the x-axis through an angle θ1, and T2: R^3 → R^3 is the rotation about
the z-axis through an angle θ2.


11. By inspection, determine whether the matrix operator is one-to-one.
(a) the orthogonal projection on the x-axis in R^2
(b) the reflection about the y-axis in R^2
(c) the reflection about the line y = x in R^2
(d) a contraction with factor k > 0 in R^2
(e) a rotation about the z-axis in R^3
(f) a reflection about the xy-plane in R^3
(g) a dilation with factor k > 0 in R^3


Answer:


(a) Not one-to-one
(b) One-to-one
(c) One-to-one
(d) One-to-one
(e) One-to-one
(f) One-to-one
(g) One-to-one


12. Find the standard matrix for the matrix operator defined by the equations, and use Theorem 4.10.4 to
determine whether the operator is one-to-one.


(a) w1 = 8x1 + 4x2
    w2 = 2x1 + x2
(b) w1 = 2x1 − 3x2
    w2 = 5x1 + x2
(c) w1 = −x1 + 3x2 + 2x3
    w2 = 2x1 + 4x3
    w3 = x1 + 3x2 + 6x3
(d) w1 = x1 + 2x2 + 3x3
    w2 = 2x1 + 5x2 + 3x3
    w3 = x1 + 8x3

13. Determine whether the matrix operator T: R^2 → R^2 defined by the equations is one-to-one; if so, find the
standard matrix for the inverse operator, and find T^{-1}(w1, w2).
(a) w1 = x1 + 2x2
    w2 = −x1 + x2
(b) w1 = 4x1 − 6x2
    w2 = −2x1 + 3x2
(c) w1 = −x2
    w2 = −x1
(d) w1 = 3x1
    w2 = −5x1

Answer:

(a) One-to-one;
\[
\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} \\ \frac{1}{3} & \frac{1}{3} \end{bmatrix};
\quad
T^{-1}(w_1, w_2) = \left( \tfrac{1}{3}w_1 - \tfrac{2}{3}w_2,\; \tfrac{1}{3}w_1 + \tfrac{1}{3}w_2 \right)
\]

(b) Not one-to-one

(c) One-to-one;
\[
\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix};
\quad
T^{-1}(w_1, w_2) = (-w_2, -w_1)
\]

(d) Not one-to-one


14. Determine whether the matrix operator T: R? — R? defined by the equations is one-to-one; if so, find the 
standard matrix for the inverse operator, and find T v 1, —2, vi) 
(а) W1 = х] = 2х2 + 2х3 
м2 = 2х1 + х2 + X3 


з= Хх] X2 

(b) W1 = х = 3х2 + 4х3 
тле = х] + X24 X3 
w3 = — 2х2 + 5х3 


(с) №1 = х] + 4x; = х3 
мо = 2х1 + 7х2 + x3 
з= ху + 3x2 


(d wi = х] +2х2+ х3 
w= = 2х1 + х2 + 4х3 
w3 = 7x1 4х2 = 5х3 


15. By inspection, find the inverse of the given one-to-one matrix operator.
(a) The reflection about the x-axis in R^2.
(b) The rotation through an angle of π/4 in R^2.
(c) The dilation by a factor of 3 in R^3.
(d) The reflection about the yz-plane in R^3.
(e) The contraction by a factor of 1/5 in R^3.


Answer:


(a) Reflection about the x-axis

(b) Rotation through the angle −π/4

(c) Contraction by a factor of 1/3

(d) Reflection about the yz-plane
(e) Dilation by a factor of 5


In Exercises 16-17, use Theorem 4.10.2 to determine whether T: R^2 → R^2 is a matrix operator.
16. (a) T(x, y) = (2x, y)

(b) T(x. = (к>) 

(c) T(x, y) = ( —-». x) 

(d) T(x, y) = (x, 0) 
17. (a) T(x, у) = Qx y, x =y) 

(b) T(x, y) = (+1, у) 


(с) Tix, у) = (у, у) 


өто) (9) 


Answer: 


(a) Matrix operator 
(b) Not a matrix operator 
(c) Matrix operator 


(d) Not a matrix operator 


In Exercises 18-19, use Theorem 4.10.2 to determine whether $T: R^3 \to R^2$ is a matrix transformation.


18. (a) $T(x, y, z) = (x, x + y + z)$
(b) $T(x, y, z) = (1, 1)$


19. (a) $T(x, y, z) = (0, 0)$
(b) $T(x, y, z) = (3x - 4y, 2x - 5z)$


Answer: 


(a) Matrix transformation 


(b) Matrix transformation 


20. In each part, use Theorem 4.10.3 to find the standard matrix for the matrix operator from the images of the 
standard basis vectors. 


(a) The reflection operators on $R^2$ in Table 1 of Section 4.9.
(b) The reflection operators on $R^3$ in Table 2 of Section 4.9.
(c) The projection operators on $R^2$ in Table 3 of Section 4.9.
(d) The projection operators on $R^3$ in Table 4 of Section 4.9.
(e) The rotation operators on $R^2$ in Table 5 of Section 4.9.
(f) The dilation and contraction operators on $R^3$ in Table 8 of Section 4.9.


21. Find the standard matrix for the given matrix operator.


(a) $T: R^2 \to R^2$ projects a vector orthogonally onto the x-axis and then reflects that vector about the y-axis.
(b) $T: R^2 \to R^2$ reflects a vector about the line y = x and then reflects that vector about the x-axis.
(c) $T: R^2 \to R^2$ dilates a vector by a factor of 3, then reflects that vector about the line y = x, and then
projects that vector orthogonally onto the y-axis.


Answer:
(a) $\begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$
(b) $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$
(c) $\begin{bmatrix} 0 & 0 \\ 3 & 0 \end{bmatrix}$


22. Find the standard matrix for the given matrix operator.


(a) $T: R^3 \to R^3$ reflects a vector about the xz-plane and then contracts that vector by a factor of $\tfrac{1}{5}$.
(b) $T: R^3 \to R^3$ projects a vector orthogonally onto the xz-plane and then projects that vector orthogonally
onto the xy-plane.
(c) $T: R^3 \to R^3$ reflects a vector about the xy-plane, then reflects that vector about the xz-plane, and then
reflects that vector about the yz-plane.


23. Let $T_A: R^3 \to R^3$ be multiplication by
\[ A = \begin{bmatrix} -1 & 3 & 0 \\ 2 & 1 & 2 \\ 4 & 5 & -3 \end{bmatrix} \]
and let $e_1$, $e_2$, and $e_3$ be the standard basis vectors for $R^3$. Find the following vectors by inspection.
(a) $T_A(e_1)$, $T_A(e_2)$, and $T_A(e_3)$
(b) $T_A(e_1 + e_2 + e_3)$
(c) $T_A(7e_3)$


Answer:
(a) $T_A(e_1) = (-1, 2, 4)$, $T_A(e_2) = (3, 1, 5)$, $T_A(e_3) = (0, 2, -3)$
(b) $T_A(e_1 + e_2 + e_3) = (2, 5, 6)$
(c) $T_A(7e_3) = (0, 14, -21)$


24. Determine whether multiplication by A is a one-to-one matrix transformation.
(a) $A = \begin{bmatrix} 1 & -1 \\ 2 & 0 \\ 3 & -4 \end{bmatrix}$
(b) A= Lou 3 
—1 0 —4 
(с) ^ EE! 
0 1 1 
А= 
сй 
10 =1 


25. (a) Is a composition of one-to-one matrix transformations one-to-one? Justify your conclusion.


(b) Can the composition of a one-to-one matrix transformation and a matrix transformation that is not 
one-to-one be one-to-one? Account for both possible orders of composition and justify your conclusion. 


Answer: 


(a) Yes 
(b) Yes 


26. Show that $T(x, y) = (0, 0)$ defines a matrix operator on $R^2$ but $T(x, y) = (1, 1)$ does not.


27. (a) Prove: If $T: R^n \to R^m$ is a matrix transformation, then $T(0) = 0$; that is, T maps the zero vector in $R^n$
into the zero vector in $R^m$.


(b) The converse of this is not true. Find an example of a function that satisfies $T(0) = 0$ but is not a matrix
transformation. 


Answer:
(b) $T(x_1, x_2) = (x_1^2 + x_2^2,\ x_1 x_2)$


28. Prove: An $n \times n$ matrix A is invertible if and only if the linear system $Ax = w$ has exactly one solution for
every vector w in $R^n$ for which the system is consistent.


29. Let A be an $n \times n$ matrix such that det(A) = 0, and let $T: R^n \to R^n$ be multiplication by A.
(a) What can you say about the range of the matrix operator T? Give an example that illustrates your conclusion.


(b) What can you say about the number of vectors that T maps into 0?
Answer: 


(a) The range of T is a proper subset of $R^n$.


(b) T must map infinitely many vectors to 0.
30. Prove: If the matrix transformation $T_A: R^n \to R^n$ is one-to-one, then A is invertible.


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 
(a) If $T: R^n \to R^m$ and $T(0) = 0$, then T is a matrix transformation.
Answer: 


False 


(b) If $T: R^n \to R^m$ and $T(c_1 x + c_2 y) = c_1 T(x) + c_2 T(y)$ for all scalars $c_1$ and $c_2$ and all vectors x and y in $R^n$,
then T is a matrix transformation.


Answer: 


True 


(c) If $T: R^n \to R^m$ is a one-to-one matrix transformation, then there are no distinct vectors x and y for which
$T(x - y) = 0$.
Answer: 


True 


(d) If $T: R^n \to R^m$ is a matrix transformation and $m > n$, then T is one-to-one.
Answer: 


False 


(e) If $T: R^n \to R^m$ is a matrix transformation and $m = n$, then T is one-to-one.
Answer: 


False 


(f) If $T: R^n \to R^m$ is a matrix transformation and $m < n$, then T is one-to-one.
Answer: 


False 




4.11 Geometry of Matrix Operators on $R^2$


In this optional section we will discuss matrix operators on $R^2$ in a little more depth. The ideas that we will develop here
have important applications to computer graphics.


Transformations of Regions 


In Section 4.9 we focused on the effect that a matrix operator has on individual vectors in $R^2$ and $R^3$. However, it is also
important to understand how such operators affect the shapes of regions. For example, Figure 4.11.1 shows a famous
picture of Albert Einstein and three computer-generated modifications of that image that result from matrix operators on
$R^2$. The original picture was scanned and then digitized to decompose it into a rectangular array of pixels. The pixels
were then transformed as follows:
* The program MATLAB was used to assign coordinates and a gray level to each pixel. 
* The coordinates of the pixels were transformed by matrix multiplication. 


* The pixels were then assigned their original gray levels to produce the transformed picture. 


Figure 4.11.1 (image omitted; panels: Rotated, Sheared horizontally, Compressed horizontally)


The overall effect of a matrix operator on $R^2$ can often be ascertained by graphing the images of the vertices
(0, 0), (1, 0), (0, 1), and (1, 1) of the unit square (Figure 4.11.2). Table 1 shows the effect that some of the matrix
operators studied in Section 4.9 have on the unit square. For clarity, we have shaded a portion of the original square and
its corresponding image.


Figure 4.11.2 (image omitted; panels include: Unit square; Unit square rotated; Unit square reflected about the y-axis; Unit square reflected about the line y = x)


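Table 1 (the "Effect on the Unit Square" column consisted of figures and is omitted here)


Operator: Standard Matrix

Reflection about the y-axis: $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

Reflection about the x-axis: $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$

Reflection about the line y = x: $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$

Counterclockwise rotation through an angle $\theta$: $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$

Compression in the x-direction by a factor of k ($0 < k < 1$): $\begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$

Expansion in the x-direction by a factor of k ($k > 1$): $\begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$

Shear in the x-direction with factor $k > 0$, which maps (x, y) to (x + ky, y): $\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$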


EXAMPLE 1 Transforming with Diagonal Matrices


Suppose that the xy-plane first is compressed or expanded by a factor of $k_1$ in the x-direction and then is
compressed or expanded by a factor of $k_2$ in the y-direction. Find a single matrix operator that performs
both operations.


Solution The standard matrices for the two operations are
\[ \begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} \ \text{(x-compression or expansion)} \qquad \begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix} \ \text{(y-compression or expansion)} \]


Thus, the standard matrix for the composition of the x-operation followed by the y-operation is
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix} \begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} \tag{1} \]


This shows that multiplication by a diagonal $2 \times 2$ matrix compresses or expands the plane in the
x-direction and also in the y-direction. In the special case where $k_1$ and $k_2$ are the same, say $k_1 = k_2 = k$,
Formula 1 simplifies to
\[ A = \begin{bmatrix} k & 0 \\ 0 & k \end{bmatrix} \]
which is a contraction or a dilation (Table 7 of Section 4.9).
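
The composition in Example 1 can be checked numerically. The following is a minimal sketch in plain Python; the helper name `matmul` and the sample factors 3 and 0.5 are assumptions chosen for illustration and do not come from the text.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Assumed sample factors: expand by 3 in x, compress by 0.5 in y.
k1, k2 = 3.0, 0.5

x_scale = [[k1, 0.0], [0.0, 1.0]]   # x-compression or expansion
y_scale = [[1.0, 0.0], [0.0, k2]]   # y-compression or expansion

# "x-operation followed by y-operation" means the y-matrix multiplies on the left.
combined = matmul(y_scale, x_scale)
print(combined)
```

Because both factors are diagonal, multiplying them in the opposite order gives the same matrix; that is special to this example and does not hold for matrix operators in general.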


EXAMPLE 2 Finding Matrix Operators


(a) Find the standard matrix for the operator on $R^2$ that first shears by a factor of 2 in the x-direction and
then reflects the result about the line y = x. Sketch the image of the unit square under this operator.


(b) Find the standard matrix for the operator on $R^2$ that first reflects about y = x and then shears by a
factor of 2 in the x-direction. Sketch the image of the unit square under this operator.


(c) Confirm that the shear and the reflection in parts (a) and (b) do not commute.


Solution


(a) The standard matrix for the shear is
\[ A_1 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \]
and for the reflection is
\[ A_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]
Thus, the standard matrix for the shear followed by the reflection is
\[ A_2 A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix} \]


(b) The standard matrix for the reflection followed by the shear is
\[ A_1 A_2 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix} \]


(c) The computations in Solutions (a) and (b) show that $A_1 A_2 \neq A_2 A_1$, so the standard matrices, and
hence the operators, do not commute. The same conclusion follows from Figures 4.11.3 and 4.11.4,
since the two operators produce different images of the unit square.
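
As a rough numerical companion to Example 2, the sketch below multiplies the shear and reflection matrices in both orders and maps the vertices of the unit square. The helper names `matmul` and `apply_matrix` are hypothetical and were introduced only for this illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def apply_matrix(m, point):
    """Image of a point (x, y) under multiplication by the 2 x 2 matrix m."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)

shear = [[1, 2], [0, 1]]      # shear by a factor of 2 in the x-direction
reflect = [[0, 1], [1, 0]]    # reflection about y = x

shear_then_reflect = matmul(reflect, shear)   # part (a)
reflect_then_shear = matmul(shear, reflect)   # part (b)

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image_a = [apply_matrix(shear_then_reflect, p) for p in unit_square]
image_b = [apply_matrix(reflect_then_shear, p) for p in unit_square]

print(shear_then_reflect, image_a)
print(reflect_then_shear, image_b)
```

The two products differ, and so do the two lists of image vertices, which is the non-commutativity asserted in part (c).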


Figure 4.11.3 (image omitted; panel labels: Reflection about y = x; Shear in the x-direction with k = 2)


Figure 4.11.4 (image omitted; panel labels: Shear in the x-direction with k = 2; Reflection about y = x)


Geometry of One-to-One Matrix Operators 


We will now turn our attention to one-to-one matrix operators on $R^2$, which are important because they map distinct
points into distinct points. Recall from Theorem 4.10.4 (the Equivalence Theorem) that a matrix transformation $T_A$ is
one-to-one if and only if A can be expressed as a product of elementary matrices. Thus, we can analyze the effect of any
one-to-one transformation $T_A$ by first factoring the matrix A into a product of elementary matrices, say
\[ A = E_1 E_2 \cdots E_r \]
and then expressing $T_A$ as the composition
\[ T_A = T_{E_1 E_2 \cdots E_r} = T_{E_1} \circ T_{E_2} \circ \cdots \circ T_{E_r} \tag{2} \]


The following theorem explains the geometric effect of matrix operators corresponding to elementary matrices.


THEOREM 4.11.1 


If E is an elementary matrix, then $T_E: R^2 \to R^2$ is one of the following:
(a) A shear along a coordinate axis.
(b) A reflection about y = x.
(c) A compression along a coordinate axis.
(d) An expansion along a coordinate axis.
(e) A reflection about a coordinate axis.
(f) A compression or expansion along a coordinate axis followed by a reflection about a coordinate axis.


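Proof Because a $2 \times 2$ elementary matrix results from performing a single elementary row operation on the $2 \times 2$
identity matrix, such a matrix must have one of the following forms (verify):
\[ \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix} \]
The first two matrices represent shears along coordinate axes, and the third represents a reflection about y = x. If $k > 0$,
the last two matrices represent compressions or expansions along coordinate axes, depending on whether $0 < k < 1$ or
$k > 1$. If $k < 0$, and if we express k in the form $k = -k_1$, where $k_1 > 0$, then the last two matrices can be written as
\[ \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -k_1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} \tag{3} \]
\[ \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -k_1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & k_1 \end{bmatrix} \tag{4} \]


Since $k_1 > 0$, the product in 3 represents a compression or expansion along the x-axis followed by a reflection about the
y-axis, and 4 represents a compression or expansion along the y-axis followed by a reflection about the x-axis. In the
case where $k = -1$, transformations 3 and 4 are simply reflections about the y-axis and x-axis, respectively.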


Since every invertible matrix is a product of elementary matrices, the following result follows from Theorem 4.11.1 and 
Formula 2. 


THEOREM 4.11.2


If $T_A: R^2 \to R^2$ is multiplication by an invertible matrix A, then the geometric effect of $T_A$ is the same as an
appropriate succession of shears, compressions, expansions, and reflections.


EXAMPLE 3 Analyzing the Geometric Effect of a Matrix Operator


Assuming that $k_1$ and $k_2$ are positive, express the diagonal matrix
\[ A = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} \]
as a product of elementary matrices, and describe the geometric effect of multiplication by A in terms of
compressions and expansions.


Solution From Example 1 we have
\[ A = \begin{bmatrix} k_1 & 0 \\ 0 & k_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & k_2 \end{bmatrix} \begin{bmatrix} k_1 & 0 \\ 0 & 1 \end{bmatrix} \]
which shows that multiplication by A has the geometric effect of compressing or expanding by a factor of
$k_1$ in the x-direction and then compressing or expanding by a factor of $k_2$ in the y-direction.


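EXAMPLE 4 Analyzing the Geometric Effect of a Matrix Operator


Express
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \]
as a product of elementary matrices, and then describe the geometric effect of multiplication by A in terms
of shears, compressions, expansions, and reflections.


Solution A can be reduced to I as follows:
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \to \begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix} \to \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \to \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
The three steps are: add $-3$ times the first row to the second; multiply the second row by $-\tfrac{1}{2}$; add $-2$ times
the second row to the first.


The three successive row operations can be performed by multiplying A on the left successively by
\[ E_1 = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \]
Inverting these matrices and using Formula 4 of Section 1.5 yields
\[ A = E_1^{-1} E_2^{-1} E_3^{-1} = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \]


Reading from right to left and noting that
\[ \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \]
it follows that the effect of multiplying by A is equivalent to
1. shearing by a factor of 2 in the x-direction,
2. then expanding by a factor of 2 in the y-direction,
3. then reflecting about the x-axis,
4. then shearing by a factor of 3 in the y-direction.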
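
The four geometric steps listed in Example 4 can be multiplied back together as a sanity check. This is a minimal sketch in plain Python; the names `matmul`, `steps`, and `product` are assumptions made for the illustration.

```python
from functools import reduce

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# The four operations of Example 4, in the order they are applied.
steps = [
    [[1, 2], [0, 1]],    # 1. shear by a factor of 2 in the x-direction
    [[1, 0], [0, 2]],    # 2. expansion by a factor of 2 in the y-direction
    [[1, 0], [0, -1]],   # 3. reflection about the x-axis
    [[1, 0], [3, 1]],    # 4. shear by a factor of 3 in the y-direction
]

identity = [[1, 0], [0, 1]]

# Each later operation multiplies on the left of what has been built so far.
product = reduce(lambda acc, step: matmul(step, acc), steps, identity)
print(product)
```

The left-multiplication inside `reduce` mirrors the "reading from right to left" convention used in the solution: the first operation applied is the rightmost factor.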


Images of Lines Under Matrix Operators 


Many images in computer graphics are constructed by connecting points with line segments. The following theorem, 
some of whose parts are proved in the exercises, is helpful for understanding how matrix operators transform such 
figures. 


THEOREM 4.11.3 


If $T: R^2 \to R^2$ is multiplication by an invertible matrix, then:

(a) The image of a straight line is a straight line. 

(b) The image of a straight line through the origin is a straight line through the origin. 

(c) The images of parallel straight lines are parallel straight lines. 

(d) The image of the line segment joining points P and Q is the line segment joining the images of P and Q. 


(e) The images of three points lie on a line if and only if the points themselves lie on a line. 


Note that it follows from Theorem 4.11.3 that if A is
an invertible $2 \times 2$ matrix, then multiplication by A
maps triangles into triangles and parallelograms into
parallelograms.


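EXAMPLE 5 Image of a Square


Sketch the image of the square with vertices (0, 0), (1, 0), (1, 1), and (0, 1) under multiplication by
\[ A = \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \]


Solution Since
\[ \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix} \]
\[ \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 2 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
the image of the square is a parallelogram with vertices (0, 0), (−1, 2), (2, −1), and (1, 1) (Figure
4.11.5).


Figure 4.11.5 (image omitted)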


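EXAMPLE 6 Image of a Line


According to Theorem 4.11.3, the invertible matrix
\[ A = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix} \]
maps the line $y = 2x + 1$ into another line. Find its equation.


Solution Let (x, y) be a point on the line $y = 2x + 1$, and let $(x', y')$ be its image under
multiplication by A. Then
\[ \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}^{-1} \begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} \]
so
\[ x = x' - y', \qquad y = -2x' + 3y' \]
Substituting in $y = 2x + 1$ yields
\[ -2x' + 3y' = 2(x' - y') + 1 \quad \text{or equivalently} \quad y' = \tfrac{4}{5}x' + \tfrac{1}{5} \]


Thus $(x', y')$ satisfies
\[ y = \tfrac{4}{5}x + \tfrac{1}{5} \]
which is the equation we want.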
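
One way to spot-check the result of Example 6 is to map a few sample points of the original line and test whether their images satisfy the derived equation. The sketch below uses integer arithmetic, so the image line $y' = \tfrac{4}{5}x' + \tfrac{1}{5}$ is tested in the equivalent form $5y' = 4x' + 1$; the helper name `apply_matrix` and the choice of sample points are assumptions.

```python
def apply_matrix(m, point):
    """Image of a point (x, y) under multiplication by the 2 x 2 matrix m."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)

A = [[3, 1], [2, 1]]

# Sample points on the original line y = 2x + 1 (assumed sample range).
points = [(x, 2 * x + 1) for x in range(-3, 4)]
images = [apply_matrix(A, p) for p in points]

# y' = (4/5)x' + 1/5 is equivalent to 5y' = 4x' + 1, which avoids fractions.
on_image_line = all(5 * yp == 4 * xp + 1 for xp, yp in images)
print(images)
print(on_image_line)
```

A finite check like this does not replace the algebraic derivation, but two distinct image points already determine the image line, since Theorem 4.11.3 guarantees that the image is a line.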


Concept Review 
* Effect of a matrix operator on the unit square 
* Geometry of one-to-one matrix operators 


* Images of lines under matrix operators


Skills 


* Find standard matrices for geometric transformations of $R^2$.


* Describe the geometric effect of an invertible matrix operator. 
* Find the image of the unit square under a matrix operator. 


* Find the image of a line under a matrix operator. 


Exercise Set 4.11 


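1. Find the standard matrix for the operator $T: R^2 \to R^2$ that maps a point (x, y) into
(a) its reflection about the line $y = -x$.
(b) its reflection through the origin.
(c) its orthogonal projection on the x-axis.
(d) its orthogonal projection on the y-axis.


Answer:
(a) $\begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}$
(b) $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$
(c) $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$
(d) $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$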


2. For each part of Exercise 1, use the matrix you have obtained to compute T(2, 1). Check your answers geometrically
by plotting the points (2, 1) and T(2, 1).


3. Find the standard matrix for the operator $T: R^3 \to R^3$ that maps a point (x, y, z) into


(a) its reflection through the xy-plane. 
(b) its reflection through the xz-plane. 
(c) its reflection through the yz-plane. 


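Answer:
(a) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$
(b) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
(c) $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$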

4. For each part of Exercise 3, use the matrix you have obtained to compute T(1, 1, 1). Check your answers
geometrically by plotting the points (1, 1, 1) and T(1, 1, 1).


5. Find the standard matrix for the operator $T: R^3 \to R^3$ that
(a) rotates each vector 90° counterclockwise about the z-axis (looking along the positive z-axis toward the origin).
(b) rotates each vector 90° counterclockwise about the x-axis (looking along the positive x-axis toward the origin).
(c) rotates each vector 90° counterclockwise about the y-axis (looking along the positive y-axis toward the origin).


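Answer:
(a) $\begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
(b) $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$
(c) $\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}$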


6. Sketch the image of the rectangle with vertices (0, 0), (1, 0), (1, 2), and (0, 2) under


(a) a reflection about the x-axis. 


(b) a reflection about the y-axis. 


(с) a compression of factor k = i in the y-direction. 


(d) an expansion of factor k = 2 in the x-direction.
(e) a shear of factor k = 3 in the x-direction.
(f) a shear of factor k = 2 in the y-direction.


7. Sketch the image of the square with vertices (0, 0), (1, 0), (0, 1), and (1, 1) under multiplication by
\[ A = \begin{bmatrix} -3 & 0 \\ 0 & 1 \end{bmatrix} \]


Answer:
Rectangle with vertices at (0, 0), (−3, 0), (0, 1), (−3, 1)


8. Find the matrix that rotates a point (x, y) about the origin
(a) 45°
(b) 90°
(c) 180°
(d) 270°
(e) −30°
9. Find the matrix that shears by 


(a) a factor of k = 4 in the y-direction.
(b) a factor of k = −2 in the x-direction.


Answer: 


10. Find the matrix that compresses or expands by 


(a) a factor of $\tfrac{1}{3}$ in the y-direction.


(b) a factor of 6 in the x-direction. 


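11. In each part, describe the geometric effect of multiplication by A.
(a) $A = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}$
(b) $A = \begin{bmatrix} 1 & 0 \\ 0 & -5 \end{bmatrix}$
(c) $A = \begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}$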


Answer: 


(a) Expansion by a factor of 3 in the x-direction 
(b) Expansion by a factor of 5 in the y-direction and reflection about the x-axis 
(c) Shearing by a factor of 4 in the x-direction 


12. In each part, express the matrix as a product of elementary matrices, and then describe the effect of multiplication by 
A in terms of compressions, expansions, reflections, and shears. 


"a 
va 
EE 
eal] 


13. In each part, find a single matrix that performs the indicated succession of operations. 


(a) Compresses by a factor of i in the x-direction, then expands by a factor of 5 in the y-direction. 


(b) Expands by a factor of 5 in the y-direction, then shears by a factor of 2 in the y-direction. 
(c) Reflects about y = x, then rotates through an angle of 180° about the origin.


Answer: 


14. In each part, find a single matrix that performs the indicated succession of operations. 
(a) Reflects about the y-axis, then expands by a factor of 5 in the x-direction, and then reflects about y = x.
(b) Rotates through 30° about the origin, then shears by a factor of −2 in the y-direction, and then expands by a
factor of 3 in the y-direction.
15. Use matrix inversion to show the following.
(a) The inverse transformation for a reflection about y = x is a reflection about y = x.
(b) The inverse transformation for a compression along an axis is an expansion along that axis. 
(c) The inverse transformation for a reflection about a coordinate axis is a reflection about that axis. 


(d) The inverse transformation for a shear along a coordinate axis is a shear along that axis. 
16. Find an equation of the image of the line y = — 4x + 3 under multiplication by 
4 —3 
А= 
17. In parts (a) through (e), find an equation of the image of the line y = 2x under
(a) a shear of factor 3 in the x-direction. 


(b) a compression of factor i in the y-direction. 


(c) a reflection about y = x. 
(d) a reflection about the y-axis. 


(e) a rotation of 60° about the origin.


Answer: 

(a) у= 2х 

(b) Y= 

(c) у= іх 

(d) y= = 2х 

(е) e - 8 = | 


18. Find the matrix for a shear in the x-direction that transforms the triangle with vertices (0, 0), (2, 1), and (3, 0) into 
a right triangle with the right angle at the origin. 


19. (a) Show that multiplication by 


maps each point in the plane onto the line y = 2x. 


(b) It follows from part (a) that the noncollinear points (1, 0), (0, 1), (−1, 0) are mapped onto a line. Does this
violate part (e) of Theorem 4.11.3?


Answer:


(b) No


20. Prove part (a) of Theorem 4.11.3. [Hint: A line in the plane has an equation of the form $Ax + By + C = 0$, where A
and B are not both zero. Use the method of Example 6 to show that the image of this line under multiplication by the
invertible matrix
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
has the equation $A'x + B'y + C = 0$, where
\[ A' = (dA - cB)/(ad - bc) \]
and
\[ B' = (-bA + aB)/(ad - bc) \]
Then show that $A'$ and $B'$ are not both zero to conclude that the image is a line.]


21. Use the hint in Exercise 20 to prove parts (b) and (c) of Theorem 4.11.3.


22. In each part of the accompanying figure, find the standard matrix for the operator described.


Figure Ex-22 (image omitted; parts (a), (b), and (c))


23. In $R^3$ the shear in the xy-direction with factor k is the matrix transformation that moves each point (x, y, z) parallel
to the xy-plane to the new position (x + kz, y + kz, z). (See the accompanying figure.)
(a) Find the standard matrix for the shear in the xy-direction with factor k.
(b) How would you define the shear in the xz-direction with factor k and the shear in the yz-direction with factor k?
Find the standard matrices for these matrix transformations.


Figure Ex-23 (image omitted)


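Answer:


(b) Shear in the xz-direction with factor k maps (x, y, z) to (x + ky, y, z + ky):
\[ \begin{bmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & k & 1 \end{bmatrix} \]
Shear in the yz-direction with factor k maps (x, y, z) to (x, y + kx, z + kx):
\[ \begin{bmatrix} 1 & 0 & 0 \\ k & 1 & 0 \\ k & 0 & 1 \end{bmatrix} \]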


True-False Exercises 

In parts (a)-(g) determine whether the statement is true or false, and justify your answer. 

(a) The image of the unit square under a one-to-one matrix operator is a square. 
Answer: 


False 
(b) A $2 \times 2$ invertible matrix operator has the geometric effect of a succession of shears, compressions, expansions, and
reflections.


Answer: 


True 


(c) The image of a line under a one-to-one matrix operator is a line. 
Answer: 


True 


(d) Every reflection operator on $R^2$ is its own inverse.


Answer: 


True 


(е) The matrix | | | | represents reflection about a line. 


Answer: 
False 


(0 The matrix E Ei represents a shear. 


Answer: 


False 


(g) The matrix [ ;| represents an expansion. 


Answer: 


True 
4.12 Dynamical Systems and Markov Chains


In this optional section we will show how matrix methods can be used to analyze the behavior of physical systems that 
evolve over time. The methods that we will study here have been applied to problems in business, ecology, 
demographics, sociology, and most of the physical sciences. 


Dynamical Systems 


A dynamical system is a finite set of variables whose values change with time. The value of a variable at a point in time
is called the state of the variable at that time, and the vector formed from these states is called the state of the
dynamical system at that time. Our primary objective in this section is to analyze how the state of a dynamical system
changes with time. Let us begin with an example.


EXAMPLE 1 Market Share as a Dynamical System + 


Suppose that two competing television channels, channel 1 and channel 2, each have 50% of the viewer
market at some initial point in time. Assume that over each one-year period channel 1 captures 10% of
channel 2's share, and channel 2 captures 20% of channel 1's share (see Figure 4.12.1). What is each
channel's market share after one year?


Figure 4.12.1 (image omitted) Channel 1 loses 20% and holds 80%. Channel 2 loses 10% and holds 90%.


Solution Let us begin by introducing the time-dependent variables
$x_1(t)$ = fraction of the market held by channel 1 at time t
$x_2(t)$ = fraction of the market held by channel 2 at time t


and the column vector
\[ x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} \]
whose first entry is channel 1's fraction of the market at time t in years and whose second entry is channel 2's
fraction of the market at time t in years.


The variables $x_1(t)$ and $x_2(t)$ form a dynamical system whose state at time t is the vector $x(t)$. If we
take $t = 0$ to be the starting point at which the two channels had 50% of the market, then the state of the
system at that time is
\[ x(0) = \begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} \tag{1} \]


Now let us try to find the state of the system at time $t = 1$ (one year later). Over the one-year period,
channel 1 retains 80% of its initial 50%, and it gains 10% of channel 2's initial 50%. Thus,
\[ x_1(1) = 0.8(0.5) + 0.1(0.5) = 0.45 \tag{2} \]
Similarly, channel 2 gains 20% of channel 1's initial 50%, and retains 90% of its initial 50%. Thus,
\[ x_2(1) = 0.2(0.5) + 0.9(0.5) = 0.55 \tag{3} \]


Therefore, the state of the system at time $t = 1$ is
\[ x(1) = \begin{bmatrix} x_1(1) \\ x_2(1) \end{bmatrix} = \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix} \tag{4} \]

EXAMPLE 2 Evolution of Market Share over Five Years + 


Track the market shares of channels 1 and 2 in Example 1 over a five-year period. 


Solution To solve this problem suppose that we have already computed the market share of each
channel at time $t = k$ and we are interested in using the known values of $x_1(k)$ and $x_2(k)$ to compute the
market shares $x_1(k + 1)$ and $x_2(k + 1)$ one year later. The analysis is exactly the same as that used to
obtain Equations 2 and 3. Over the one-year period, channel 1 retains 80% of its starting fraction $x_1(k)$
and gains 10% of channel 2's starting fraction $x_2(k)$. Thus,
\[ x_1(k + 1) = (0.8)x_1(k) + (0.1)x_2(k) \tag{5} \]


Similarly, channel 2 gains 20% of channel 1's starting fraction $x_1(k)$ and retains 90% of its own starting
fraction $x_2(k)$. Thus,
\[ x_2(k + 1) = (0.2)x_1(k) + (0.9)x_2(k) \tag{6} \]
Equations 5 and 6 can be expressed in matrix form as
\[ \begin{bmatrix} x_1(k + 1) \\ x_2(k + 1) \end{bmatrix} = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} \tag{7} \]
which provides a way of using matrix multiplication to compute the state of the system at time $t = k + 1$
from the state at time $t = k$. For example, using 1 and 7 we obtain
\[ x(1) = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} x(0) = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix} \]
which agrees with 4. Similarly,
\[ x(2) = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} x(1) = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix} = \begin{bmatrix} 0.415 \\ 0.585 \end{bmatrix} \]
We can now continue this process, using Formula 7 to compute x(3) from x(2), then x(4) from x(3),
and so on. This yields (verify)
\[ x(3) = \begin{bmatrix} 0.3905 \\ 0.6095 \end{bmatrix}, \quad x(4) = \begin{bmatrix} 0.37335 \\ 0.62665 \end{bmatrix}, \quad x(5) = \begin{bmatrix} 0.361345 \\ 0.638655 \end{bmatrix} \tag{8} \]


Thus, after five years, channel 1 will hold about 36% of the market and channel 2 will hold about 64% of
the market.


If desired, we can continue the market analysis in the last example beyond the five-year period and explore what
happens to the market share over the long term. We did so, using a computer, and obtained the following state vectors
(rounded to six decimal places):
\[ x(10) \approx \begin{bmatrix} 0.338041 \\ 0.661959 \end{bmatrix}, \quad x(20) \approx \begin{bmatrix} 0.333466 \\ 0.666534 \end{bmatrix}, \quad x(40) \approx \begin{bmatrix} 0.333333 \\ 0.666667 \end{bmatrix} \tag{9} \]


All subsequent state vectors, when rounded to six decimal places, are the same as x(40), so we see that the market
shares eventually stabilize with channel 1 holding about one-third of the market and channel 2 holding about
two-thirds. Later in this section, we will explain why this stabilization occurs.
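
The iteration described above is short enough to reproduce directly. The following is a minimal sketch in plain Python that applies Formula 7 repeatedly, starting from the state in Formula 1; the names `step` and `history` are assumptions made for the illustration, and the text does not say which program the authors used.

```python
def step(p, x):
    """One time step of the system: return the matrix-vector product p x."""
    return [sum(p[i][j] * x[j] for j in range(len(x))) for i in range(len(p))]

P = [[0.8, 0.1],
     [0.2, 0.9]]          # coefficient matrix from Formula 7

x = [0.5, 0.5]            # initial state x(0) from Formula 1
history = [x]             # history[k] holds the state x(k)
for _ in range(40):
    x = step(P, x)
    history.append(x)

for k in (1, 2, 5, 10, 20, 40):
    print(k, [round(entry, 6) for entry in history[k]])
```

The rounded values printed for k = 5, 10, 20, and 40 can be compared with Formulas 8 and 9.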


Markov Chains 


In many dynamical systems the states of the variables are not known with certainty but can be expressed as
probabilities; such dynamical systems are called stochastic processes (from the Greek word stokastikos, meaning
“proceeding by guesswork”). A detailed study of stochastic processes requires a precise definition of the term
probability, which is outside the scope of this course. However, the following interpretation will suffice for our present
purposes:


Stated informally, the probability that an experiment or observation will have a certain outcome is 
approximately the fraction of the time that the outcome would occur if the experiment were to be repeated many 
times under constant conditions—the greater the number of repetitions, the more accurately the probability 
describes the fraction of occurrences. 


For example, when we say that the probability of tossing heads with a fair coin is $\tfrac{1}{2}$, we mean that if the coin were
tossed many times under constant conditions, then we would expect about half of the outcomes to be heads.
Probabilities are often expressed as decimals or percentages. Thus, the probability of tossing heads with a fair coin can
also be expressed as 0.5 or 50%.


If an experiment or observation has n possible outcomes, then the probabilities of those outcomes must be nonnegative 
fractions whose sum is 1. The probabilities are nonnegative because each describes the fraction of occurrences of an 
outcome over the long term, and the sum is 1 because they account for all possible outcomes. For example, if a box 
containing 10 balls has one red ball, three green balls, and six yellow balls, and if a ball is drawn at random from the 
box, then the probabilities of the various outcomes are 


$p_1$ = prob(red) = 1/10 = 0.1
$p_2$ = prob(green) = 3/10 = 0.3
$p_3$ = prob(yellow) = 6/10 = 0.6
Each probability is a nonnegative fraction and
$p_1 + p_2 + p_3 = 0.1 + 0.3 + 0.6 = 1$


In a stochastic process with n possible states, the state vector at each time t has the form
\[ x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{bmatrix} \]
in which $x_1(t)$ is the probability that the system is in state 1, $x_2(t)$ is the probability that the system is in
state 2, and so on through $x_n(t)$, the probability that the system is in state n.


The entries in this vector must add up to 1 since they account for all n possibilities. In general, a vector with
nonnegative entries that add up to 1 is called a probability vector.


EXAMPLE 3 Example 1 Revisited from the Probability Viewpoint + 


Observe that the state vectors in Example 1 and Example 2 are all probability vectors. This is to be
expected since the entries in each state vector are the fractional market shares of the channels, and together
they account for the entire market. In practice, it is preferable to interpret the entries in the state vectors as
probabilities rather than exact market fractions, since market information is usually obtained by statistical
sampling procedures with intrinsic uncertainties. Thus, for example, the state vector
\[ x(1) = \begin{bmatrix} x_1(1) \\ x_2(1) \end{bmatrix} = \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix} \]
which we interpreted in Example 1 to mean that channel 1 has 45% of the market and channel 2 has 55%,
can also be interpreted to mean that an individual picked at random from the market will be a channel 1
viewer with probability 0.45 and a channel 2 viewer with probability 0.55.


A square matrix, each of whose columns is a probability vector, is called a stochastic matrix. Such matrices commonly 
occur in formulas that relate successive states of a stochastic process. For example, the state vectors x(& + 1) and x(k) 
in 7 are related by an equation of the form x(k + 1) = Px(&) in which 


[o8 01 
P-[5: as] (y 


is a stochastic matrix. It should not be surprising that the column vectors of P are probability vectors, since the entries 
in each column provide a breakdown of what happens to each channel's market share over the year—the entries in 
column 1 convey that each year channel 1 retains 80% of its market share and loses 20%; and the entries in column 2 
convey that each year channel 2 retains 90% of its market share and loses 10%. The entries in 10 can also be viewed as
probabilities:


$p_{11}$ = 0.8 = probability that a channel 1 viewer remains a channel 1 viewer
$p_{21}$ = 0.2 = probability that a channel 1 viewer becomes a channel 2 viewer
$p_{12}$ = 0.1 = probability that a channel 2 viewer becomes a channel 1 viewer
$p_{22}$ = 0.9 = probability that a channel 2 viewer remains a channel 2 viewer
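The two definitions above (probability vector, stochastic matrix) can be checked mechanically. The following is a minimal sketch, not part of the text: the helper names `is_probability_vector` and `is_stochastic` are hypothetical, and the entries are stored as exact fractions so that the column sums can be compared with 1 without roundoff.

```python
from fractions import Fraction

def is_probability_vector(v):
    """True if every entry is nonnegative and the entries sum to exactly 1."""
    return all(x >= 0 for x in v) and sum(v) == 1

def is_stochastic(P):
    """True if P (a list of rows) is square and each column is a probability vector."""
    n = len(P)
    if any(len(row) != n for row in P):
        return False
    columns = [[P[i][j] for i in range(n)] for j in range(n)]
    return all(is_probability_vector(col) for col in columns)

# Transition matrix from Formula 10, entered as exact fractions.
P = [[Fraction(8, 10), Fraction(1, 10)],
     [Fraction(2, 10), Fraction(9, 10)]]
print(is_stochastic(P))
```

Note that the check is made on columns, not rows, which matches the convention used in this section.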


Example 1 is a special case of a large class of stochastic processes, called Markov chains. 


Andrei Andreyevich Markov (1856—1922) 


Historical Note Markov chains are named in honor of the Russian mathematician A. A. Markov, a lover of 
poetry, who used them to analyze the alternation of vowels and consonants in the poem Eugene Onegin by 
Pushkin. Markov believed that the only applications of his chains were to the analysis of literary works, so he 
would be astonished to learn that his discovery is used today in the social sciences, quantum theory, and 
genetics! 

[Image: wikipedia] 


DEFINITION 1 


A Markov chain is a dynamical system whose state vectors at a succession of time intervals are probability 
vectors and for which the state vectors at successive time intervals are related by an equation of the form 


\[ x(k + 1) = Px(k) \]
in which $P = [p_{ij}]$ is a stochastic matrix and $p_{ij}$ is the probability that the system will be in state i at time
t = k + 1 if it is in state j at time t = k. The matrix P is called the transition matrix for the system.


Remark Note that in this definition the row index i corresponds to the later state and the column index j to the earlier 
state (Figure 4.12.2). 


The entry $p_{ij}$ is the probability that the system is in state i at time t = k + 1 if it is in state j
at time t = k.


Figure 4.12.2


EXAMPLE 4 Wildlife Migration as a Markov Chain


Suppose that a tagged lion can migrate over three adjacent game reserves in search of food, reserve 1,
reserve 2, and reserve 3. Based on data about the food resources, researchers conclude that the monthly
migration pattern of the lion can be modeled by a Markov chain with transition matrix

\[ P = \begin{bmatrix} 0.5 & 0.4 & 0.6 \\ 0.2 & 0.2 & 0.3 \\ 0.3 & 0.4 & 0.1 \end{bmatrix} \]

in which the columns are indexed by the reserve at time t = k and the rows by the reserve at time t = k + 1
(see Figure 4.12.3). That is,
$p_{11}$ = 0.5 = probability that the lion will stay in reserve 1 when it is in reserve 1
$p_{12}$ = 0.4 = probability that the lion will move from reserve 2 to reserve 1
$p_{13}$ = 0.6 = probability that the lion will move from reserve 3 to reserve 1
$p_{21}$ = 0.2 = probability that the lion will move from reserve 1 to reserve 2
$p_{22}$ = 0.2 = probability that the lion will stay in reserve 2 when it is in reserve 2
$p_{23}$ = 0.3 = probability that the lion will move from reserve 3 to reserve 2
$p_{31}$ = 0.3 = probability that the lion will move from reserve 1 to reserve 3
$p_{32}$ = 0.4 = probability that the lion will move from reserve 2 to reserve 3
$p_{33}$ = 0.1 = probability that the lion will stay in reserve 3 when it is in reserve 3


Assuming that t is in months and the lion is released in reserve 2 at time t = 0, track its probable
locations over a six-month period.


Figure 4.12.3


Solution Let $x_1(k)$, $x_2(k)$, and $x_3(k)$ be the probabilities that the lion is in reserve 1, 2, or 3,
respectively, at time t = k, and let

\[ x(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \\ x_3(k) \end{bmatrix} \]

be the state vector at that time. Since we know with certainty that the lion is in reserve 2 at time t = 0, the
initial state vector is

\[ x(0) = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

We leave it for you to show that the state vectors over a six-month period are

\[ x(1) = Px(0) = \begin{bmatrix} 0.400 \\ 0.200 \\ 0.400 \end{bmatrix}, \quad x(2) = Px(1) = \begin{bmatrix} 0.520 \\ 0.240 \\ 0.240 \end{bmatrix}, \quad x(3) = Px(2) = \begin{bmatrix} 0.500 \\ 0.224 \\ 0.276 \end{bmatrix} \]

\[ x(4) = Px(3) \approx \begin{bmatrix} 0.505 \\ 0.228 \\ 0.267 \end{bmatrix}, \quad x(5) = Px(4) \approx \begin{bmatrix} 0.504 \\ 0.227 \\ 0.269 \end{bmatrix}, \quad x(6) = Px(5) \approx \begin{bmatrix} 0.504 \\ 0.227 \\ 0.269 \end{bmatrix} \]

As in Example 2, the state vectors here seem to stabilize over time with a probability of approximately
0.504 that the lion is in reserve 1, a probability of approximately 0.227 that it is in reserve 2, and a
probability of approximately 0.269 that it is in reserve 3.
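The six state vectors in this example can be reproduced by repeated multiplication by P. The sketch below is one way to do it, not the book's own procedure: `mat_vec` is a hypothetical helper name, and exact fractions are used so that rounding happens only when printing.

```python
from fractions import Fraction as F

def mat_vec(P, x):
    """Multiply the matrix P (a list of rows) by the column vector x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

# Transition matrix of Example 4 and the initial state (lion released in reserve 2).
P = [[F(5, 10), F(4, 10), F(6, 10)],
     [F(2, 10), F(2, 10), F(3, 10)],
     [F(3, 10), F(4, 10), F(1, 10)]]
x = [F(0), F(1), F(0)]

states = [x]
for k in range(6):
    x = mat_vec(P, x)          # x(k + 1) = P x(k)
    states.append(x)

for k, s in enumerate(states):
    print(k, [round(float(v), 3) for v in s])
```

Each vector in `states` remains a probability vector, since multiplying a probability vector by a stochastic matrix preserves the sum of the entries.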


Markov Chains in Terms of Powers of the Transition Matrix 


In a Markov chain with an initial state of x(0), the successive state vectors are

\[ x(1) = Px(0), \quad x(2) = Px(1), \quad x(3) = Px(2), \quad x(4) = Px(3), \ldots \]

For brevity, it is common to denote x(k) by $x_k$, which allows us to write the successive state vectors more briefly as

\[ x_1 = Px_0, \quad x_2 = Px_1, \quad x_3 = Px_2, \quad x_4 = Px_3, \ldots \tag{11} \]

Alternatively, these state vectors can be expressed in terms of the initial state vector $x_0$ as

\[ x_1 = Px_0, \quad x_2 = P(Px_0) = P^2x_0, \quad x_3 = P(P^2x_0) = P^3x_0, \quad x_4 = P(P^3x_0) = P^4x_0, \ldots \]

from which it follows that

\[ x_k = P^k x_0 \tag{12} \]

Note that Formula 12 makes it possible to compute the state vector $x_k$ without first computing the
earlier state vectors as required in Formula 11.


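EXAMPLE 5 Finding a State Vector Directly from $x_0$


Use Formula 12 to find the state vector x(3) in Example 2.


Solution From 1 and 7, the initial state vector and transition matrix are

\[ x_0 = x(0) = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} \]

We leave it for you to calculate $P^3$ and show that

\[ x(3) = x_3 = P^3x_0 = \begin{bmatrix} 0.562 & 0.219 \\ 0.438 & 0.781 \end{bmatrix} \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} = \begin{bmatrix} 0.3905 \\ 0.6095 \end{bmatrix} \]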


which agrees with the result in 8. 
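The calculation left to the reader can be checked with a short script. This is only a sketch: `mat_mul` and `mat_pow` are hypothetical helper names, and the power is formed by plain repeated multiplication, which is adequate for small exponents.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, k):
    """P raised to the nonnegative integer power k (P^0 is the identity)."""
    n = len(P)
    result = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result

P = [[F(8, 10), F(1, 10)],
     [F(2, 10), F(9, 10)]]
x0 = [[F(1, 2)], [F(1, 2)]]        # column vector as a 2 x 1 matrix

P3 = mat_pow(P, 3)
x3 = mat_mul(P3, x0)               # Formula 12 with k = 3
print([[float(v) for v in row] for row in P3])
print([float(row[0]) for row in x3])
```

Computing $x_3$ step by step with Formula 11 gives the same vector, which is a useful consistency check.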


Long-Term Behavior of a Markov Chain 


We have seen two examples of Markov chains in which the state vectors seem to stabilize after a period of time. Thus, 
it is reasonable to ask whether all Markov chains have this property. The following example shows that this is not the 
case. 


EXAMPLE 6 A Markov Chain That Does Not Stabilize


The matrix

\[ P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

is stochastic and hence can be regarded as the transition matrix for a Markov chain. A simple calculation
shows that $P^2 = I$, from which it follows that

\[ I = P^2 = P^4 = P^6 = \cdots \quad \text{and} \quad P = P^3 = P^5 = P^7 = \cdots \]

Thus, the successive states in the Markov chain with initial vector $x_0$ are

\[ x_0, \; Px_0, \; x_0, \; Px_0, \; x_0, \ldots \]

which oscillate between $x_0$ and $Px_0$. Thus, the Markov chain does not stabilize unless both components
of $x_0$ are $\frac{1}{2}$ (verify).
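The oscillation described in this example is easy to observe numerically. The sketch below assumes the transition matrix is the 2 x 2 matrix that interchanges the two components (the only non-identity 2 x 2 stochastic matrix whose square is I); the helper names `mat_vec` and `orbit` and the starting vector (1/4, 3/4) are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

def mat_vec(P, x):
    """Multiply the matrix P (a list of rows) by the vector x."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]

def orbit(P, x0, steps):
    """Return the list x0, P x0, P^2 x0, ..., P^steps x0."""
    states = [x0]
    for _ in range(steps):
        states.append(mat_vec(P, states[-1]))
    return states

P = [[0, 1],
     [1, 0]]

swinging = orbit(P, [F(1, 4), F(3, 4)], 4)   # alternates between two vectors
steady = orbit(P, [F(1, 2), F(1, 2)], 4)     # stays fixed
print(swinging)
print(steady)
```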


A precise definition of what it means for a sequence of numbers or vectors to stabilize is given in calculus; however,
that level of precision will not be needed here. Stated informally, we will say that a sequence of vectors

\[ x_1, x_2, \ldots, x_k, \ldots \]

approaches a limit q or that it converges to q if all entries in $x_k$ can be made as close as we like to the corresponding
entries in the vector q by taking k sufficiently large. We denote this by writing $x_k \to q$ as $k \to \infty$.


We saw in Example 6 that the state vectors of a Markov chain need not approach a limit in all cases. However, by 


imposing a mild condition on the transition matrix of a Markov chain, we can guarantee that the state vectors will 
approach a limit. 


DEFINITION 2 


A stochastic matrix P is said to be regular if P or some positive power of P has all positive entries, and a
Markov chain whose transition matrix is regular is said to be a regular Markov chain.


EXAMPLE 7 Regular Stochastic Matrices


The transition matrices in Example 2 and Example 4 are regular because their entries are positive. The
matrix

\[ P = \begin{bmatrix} 0.5 & 1 \\ 0.5 & 0 \end{bmatrix} \]

is regular because

\[ P^2 = \begin{bmatrix} 0.75 & 0.5 \\ 0.25 & 0.5 \end{bmatrix} \]

has positive entries. The matrix P in Example 6 is not regular because P and every positive power of P
have some zero entries (verify).
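Regularity can be tested by examining successive powers of P. The sketch below is an illustration under one stated assumption that does not appear in the text: it stops after the power $(n-1)^2 + 1$, relying on Wielandt's bound for primitive nonnegative matrices, so that a matrix with no all-positive power up to that point is reported as not regular. The names `mat_mul` and `is_regular` are hypothetical.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two n x n matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_regular(P):
    """True if some power P^1, ..., P^((n-1)^2 + 1) has all positive entries.

    Assumes P is already known to be stochastic; the cutoff is Wielandt's
    bound, an outside fact rather than something proved in this section.
    """
    n = len(P)
    limit = (n - 1) ** 2 + 1
    power = P
    for _ in range(limit):
        if all(entry > 0 for row in power for entry in row):
            return True
        power = mat_mul(power, P)
    return False

print(is_regular([[F(1, 2), F(1)], [F(1, 2), F(0)]]))   # matrix of Example 7
print(is_regular([[F(0), F(1)], [F(1), F(0)]]))         # interchange matrix
```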


The following theorem, which we state without proof, is the fundamental result about the long-term behavior of 
Markov chains. 


THEOREM 4.12.1 


If P is the transition matrix for a regular Markov chain, then:

(a) There is a unique probability vector q such that Pq = q.

(b) For any initial probability vector $x_0$, the sequence of state vectors

\[ x_0, \; Px_0, \; \ldots, \; P^kx_0, \; \ldots \]

converges to q.


The vector q in this theorem is called the steady-state vector of the Markov chain. It can be found by rewriting the
equation in part (a) as

\[ (I - P)q = 0 \]

and then solving this equation for q subject to the requirement that q be a probability vector. Here are some examples.


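EXAMPLE 8 Example 1 and Example 2 Revisited


The transition matrix for the Markov chain in Example 2 is

\[ P = \begin{bmatrix} 0.8 & 0.1 \\ 0.2 & 0.9 \end{bmatrix} \]

Since the entries of P are positive, the Markov chain is regular and hence has a unique steady-state vector
q. To find q we will solve the system (I - P)q = 0, which we can write as

\[ \begin{bmatrix} 0.2 & -0.1 \\ -0.2 & 0.1 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

The general solution of this system is

\[ q_1 = 0.5s, \quad q_2 = s \]

(verify), which we can write in vector form as

\[ q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 0.5s \\ s \end{bmatrix} = \begin{bmatrix} \frac{1}{2}s \\ s \end{bmatrix} \tag{13} \]

For q to be a probability vector, we must have

\[ 1 = q_1 + q_2 = \tfrac{3}{2}s \]

which implies that $s = \frac{2}{3}$. Substituting this value in 13 yields the steady-state vector

\[ q = \begin{bmatrix} \frac{1}{3} \\ \frac{2}{3} \end{bmatrix} \]

which is consistent with the numerical results obtained in 9.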


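EXAMPLE 9 Example 4 Revisited


The transition matrix for the Markov chain in Example 4 is

\[ P = \begin{bmatrix} 0.5 & 0.4 & 0.6 \\ 0.2 & 0.2 & 0.3 \\ 0.3 & 0.4 & 0.1 \end{bmatrix} \]

Since the entries of P are positive, the Markov chain is regular and hence has a unique steady-state vector
q. To find q we will solve the system (I - P)q = 0, which we can write (using fractions) as

\[ \begin{bmatrix} \frac{1}{2} & -\frac{2}{5} & -\frac{3}{5} \\ -\frac{1}{5} & \frac{4}{5} & -\frac{3}{10} \\ -\frac{3}{10} & -\frac{2}{5} & \frac{9}{10} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{14} \]

(We have converted to fractions to avoid roundoff error in this illustrative example.) We leave it for you
to confirm that the reduced row echelon form of the coefficient matrix is

\[ \begin{bmatrix} 1 & 0 & -\frac{15}{8} \\ 0 & 1 & -\frac{27}{32} \\ 0 & 0 & 0 \end{bmatrix} \]

and that the general solution of 14 is

\[ q_1 = \tfrac{15}{8}s, \quad q_2 = \tfrac{27}{32}s, \quad q_3 = s \tag{15} \]

For q to be a probability vector we must have $q_1 + q_2 + q_3 = 1$, from which it follows that $s = \frac{32}{119}$
(verify). Substituting this value in 15 yields the steady-state vector

\[ q = \begin{bmatrix} \frac{60}{119} \\ \frac{27}{119} \\ \frac{32}{119} \end{bmatrix} \approx \begin{bmatrix} 0.5042 \\ 0.2269 \\ 0.2689 \end{bmatrix} \]

(verify), which is consistent with the results obtained in Example 4.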
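The steady-state computations in the last two examples follow the same recipe, which can be sketched in code. The version below is one possible arrangement and not the book's algorithm: it keeps the first n - 1 equations of (I - P)q = 0, replaces the last (redundant) equation by the condition that the entries of q sum to 1, and solves the resulting system by Gauss-Jordan elimination with exact fractions. The function name `steady_state` is hypothetical, and the code assumes P is regular so that the system has exactly one solution.

```python
from fractions import Fraction as F

def steady_state(P):
    """Solve (I - P) q = 0 together with q_1 + ... + q_n = 1 for a regular P."""
    n = len(P)
    # Augmented rows for the first n - 1 equations of (I - P) q = 0.
    A = [[F(int(i == j)) - P[i][j] for j in range(n)] + [F(0)]
         for i in range(n - 1)]
    # Last row: the entries of q add up to 1.
    A.append([F(1)] * n + [F(1)])
    # Gauss-Jordan elimination (raises StopIteration if no pivot exists,
    # which should not happen when P is regular).
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

P2 = [[F(8, 10), F(1, 10)],
      [F(2, 10), F(9, 10)]]
P3 = [[F(5, 10), F(4, 10), F(6, 10)],
      [F(2, 10), F(2, 10), F(3, 10)],
      [F(3, 10), F(4, 10), F(1, 10)]]
print(steady_state(P2))
print(steady_state(P3))
```

Dropping one equation is legitimate here because the rows of I - P add up to the zero row whenever the columns of P each sum to 1.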


Concept Review

* Dynamical system
* State of a variable
* State of a dynamical system
* Stochastic process
* Probability
* Probability vector
* Stochastic matrix
* Markov chain
* Transition matrix
* Regular stochastic matrix
* Regular Markov chain
* Steady-state vector


Skills

* Determine whether a matrix is stochastic.
* Compute the state vectors from a transition matrix and an initial state.
* Determine whether a stochastic matrix is regular.
* Determine whether a Markov chain is regular.
* Find the steady-state vector for a regular transition matrix.


Exercise Set 4.12 


In Exercises 1-2, determine whether A is a stochastic matrix. If A is not stochastic, then explain why not.


1 
(а) „_ |04 0.3 
4- |04 i 


(b , [04 06 
4- [03 | 


(с) 11 1 
2 3 
A=|0 0 1 
3 
д1 
023 
(d) do 1s X 
3 3 2 
il 2 
4-|$ 3 2 
11 
23 ! 
Answer: 


(a) Stochastic 
(b) Not stochastic 
(c) Stochastic 
(d) Not stochastic 


(b) 4— V. 
4-[55 01 
(c) A 11 
12 9 6 
| Т 5 
A=|5 0 2 
3.8 
23 
Че. umido 
3 2 
= ә 
Ace. dom 
1 
E 


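In Exercises 3-4, use Formulas 11 and 12 to compute the state vector $x_4$ in two different ways.

3. \[ P = \begin{bmatrix} 0.5 & 0.6 \\ 0.5 & 0.4 \end{bmatrix}; \quad x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} \]

Answer:

\[ \begin{bmatrix} 0.54545 \\ 0.45455 \end{bmatrix} \]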
4: z- 105 0. _|1 
= |; | “= |0) 


In Exercises 5—6, determine whether P is a regular stochastic matrix. 


"aer. [її 
Ds sy 
P= 
4 6 
5 7 
© [1% 
5 
P= 
24 
5 
e [1i 
Р= 
4 9 
5 
Answer: 
(a) Regular 
(b) Not regular 
(c) Regular 
6. (a) 1l] 
2 
P= 
lp 
2 
6 [2 
3 
P= 
о 1 
3 
(с) 3. 
4 3 
мо 
4 3 


In Exercises 7—10, verify that P is a regular stochastic matrix, and find the steady-state vector for the associated 
Markov chain. 


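7. \[ P = \begin{bmatrix} \frac{1}{4} & \frac{2}{3} \\ \frac{3}{4} & \frac{1}{3} \end{bmatrix} \]

Answer:

\[ \begin{bmatrix} \frac{8}{17} \\ \frac{9}{17} \end{bmatrix} \]


8. \[ P = \begin{bmatrix} 0.2 & 0.6 \\ 0.8 & 0.4 \end{bmatrix} \]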


л 1 
2 2 0 
а UM! 
= |2535 
A 2 
с 
Answer 
4 
11 
4. 
11 
A. 
11 
10. 112 
3 4 5 
3 2 
Р=|0 7 5 
2 1 
2 Ro 


11. Consider a Markov process with transition matrix

             State 1   State 2
   State 1     0.2       0.1
   State 2     0.8       0.9


(a) What does the entry 0.2 represent?
(b) What does the entry 0.1 represent?
(c) If the system is in state 1 initially, what is the probability that it will be in state 2 at the next observation?


(d) If the system has a 50% chance of being in state 1 initially, what is the probability that it will be in state 2 at the
next observation?


Answer: 


(a) Probability that something in state 1 stays in state 1 
(b) Probability that something in state 2 moves to state 1 
(c) 0.8 

(d) 0.85 


12. Consider a Markov process with transition matrix 
State 1 State 2 


State 1 
State 2 1 


„|с | 


(a) What does the entry $ represent? 


(b) What does the entry 0 represent? 


(c) If the system is in state 1 initially, what is the probability that it will be in state 1 at the next observation?
(d) If the system has a 50% chance of being in state 1 initially, what is the probability that it will be in state 2 at the
next observation?


13. On a given day the air quality in a certain city is either good or bad. Records show that when the air quality is good
on one day, then there is a 95% chance that it will be good the next day, and when the air quality is bad on one day,
then there is a 45% chance that it will be bad the next day.


(a) Find a transition matrix for this phenomenon. 
(b) If the air quality is good today, what is the probability that it will be good two days from now? 
(c) If the air quality is bad today, what is the probability that it will be bad three days from now? 


(d) If there is a 20% chance that the air quality will be good today, what is the probability that it will be good 
tomorrow? 


Answer: 


(a) \[ \begin{bmatrix} 0.95 & 0.55 \\ 0.05 & 0.45 \end{bmatrix} \]


(b) 0.93 

(c) 0.142 

(d) 0.63 

14. In a laboratory experiment, a mouse can choose one of two food types each day, type I or type II. Records show that
if the mouse chooses type I on a given day, then there is a 75% chance that it will choose type I the next day, and if
it chooses type II on one day, then there is a 50% chance that it will choose type II the next day.


(a) Find a transition matrix for this phenomenon.

(b) If the mouse chooses type I today, what is the probability that it will choose type I two days from now?

(c) If the mouse chooses type II today, what is the probability that it will choose type II three days from now?

(d) If there is a 10% chance that the mouse will choose type I today, what is the probability that it will choose type
I tomorrow?


15. Suppose that at some initial point in time 100,000 people live in a certain city and 25,000 people live in its suburbs.
The Regional Planning Commission determines that each year 5% of the city population moves to the suburbs and
3% of the suburban population moves to the city.


(a) Assuming that the total population remains constant, make a table that shows the populations of the city and its 
suburbs over a five-year period (round to the nearest integer). 


(b) Over the long term, how will the population be distributed between the city and its suburbs? 


Answer: 

(a)
   Year       1        2        3        4        5
   City     95,750   91,840   88,243   84,933   81,889
   Suburbs  29,250   33,160   36,757   40,067   43,111

(b)
   City     46,875
   Suburbs  78,125


16. Suppose that two competing television stations, station 1 and station 2, each have 50% of the viewer market at some
initial point in time. Assume that over each one-year period station 1 captures 5% of station 2's market share and
station 2 captures 10% of station 1's market share.


(a) Make a table that shows the market share of each station over a five-year period.
(b) Over the long term, how will the market share be distributed between the two stations?


17. Suppose that a car rental agency has three locations, numbered 1, 2, and 3. A customer may rent a car from any of
the three locations and return it to any of the three locations. Records show that cars are rented and returned in
accordance with the following probabilities:


Rented from Location 


Returned to Location 2 


(a) Assuming that a car is rented from location 1, what is the probability that it will be at location 1 after two
rentals?


(b) Assuming that this dynamical system can be modeled as a Markov chain, find the steady-state vector. 


(c) If the rental agency owns 120 cars, how many parking spaces should it allocate at each location to be 
reasonably certain that it will have enough spaces for the cars over the long term? Explain your reasoning. 


Answer: 


223. 
© 100 


(b) \[ \begin{bmatrix} \frac{46}{159} \\ \frac{22}{53} \\ \frac{47}{159} \end{bmatrix} \]


(c) 35, 50,35 


18. Physical traits are determined by the genes that an offspring receives from its parents. In the simplest case a trait in
the offspring is determined by one pair of genes, one member of the pair inherited from the male parent and the
other from the female parent. Typically, each gene in a pair can assume one of two forms, called alleles, denoted by
A and a. This leads to three possible pairings:

AA, Aa, aa


called genotypes (the pairs Aa and aA determine the same trait and hence are not distinguished from one another). It
is shown in the study of heredity that if a parent of known genotype is crossed with a random parent of unknown
genotype, then the offspring will have the genotype probabilities given in the following table, which can be viewed
as a transition matrix for a Markov process:




Genotype of Parent 
AA Aa aa 


AA 


Genotype of Offspring Aa 


Thus, for example, the offspring of a parent of genotype AA that is crossed at random with a parent of unknown
genotype will have a 50% chance of being AA, a 50% chance of being Aa, and no chance of being aa.


(a) Show that the transition matrix is regular.


(b) Find the steady-state vector, and discuss its physical interpretation.


19. Fill in the missing entries of the stochastic matrix


* Ul 


— 
ole 


and find its steady-state vector. 


Answer: 
Е Ш І 
10 10 5 3 
| sr Шо 
Fes. ens ram 
do 3.15 1 
10 5 10 3 
20. If P is an n × n stochastic matrix, and if M is a 1 × n matrix whose entries are all 1's, then MP = ______.


21. If P is a regular stochastic matrix with steady-state vector q, what can you say about the sequence of products

\[ Pq, \; P^2q, \; P^3q, \; \ldots, \; P^kq, \; \ldots \]

as $k \to \infty$?

Answer:

$P^kq = q$ for every positive integer k


22. (a) If P is a regular n × n stochastic matrix with steady-state vector q, and if $e_1, e_2, \ldots, e_n$ are the standard unit
vectors in column form, what can you say about the behavior of the sequence

\[ Pe_i, \; P^2e_i, \; P^3e_i, \; \ldots, \; P^ke_i, \; \ldots \]

as $k \to \infty$ for each i = 1, 2, ..., n?


(b) What does this tell you about the behavior of the column vectors of $P^k$ as $k \to \infty$?


23. Prove that the product of two stochastic matrices is a stochastic matrix. [Hint: Write each column of the product as
a linear combination of the columns of the first factor.]


24. Prove that if P is a stochastic matrix whose entries are all greater than or equal to $\rho$, then the entries of $P^2$ are
greater than or equal to $\rho$.


True-False Exercises 
In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 
(a) 


The vector Is a probability vector. 


WIP © Wile 


Answer: 
True 


(b) The matrix he 1 15 a regular stochastic matrix. 


Answer: 


True 


(c) The column vectors of a transition matrix are probability vectors. 
Answer: 


True 


(d) A steady-state vector for a Markov chain with transition matrix P is any solution of the linear system (/ — P)q = 0. 
Answer: 


False 


(e) The square of every regular stochastic matrix is stochastic. 
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


Chapter 4 Supplementary Exercises 


1. Let V be the set of all ordered triples of real numbers, and consider the following addition and scalar
multiplication operations on $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$:

\[ u + v = (u_1 + v_1, \; u_2 + v_2, \; u_3 + v_3), \quad ku = (ku_1, 0, 0) \]

(a) Compute u + v and ku for u = (3, -2, 4), v = (1, 5, -2), and k = -1.

(b) In words, explain why V is closed under addition and scalar multiplication.

(c) Since the addition operation on V is the standard addition operation on $R^3$, certain vector space axioms
hold for V because they are known to hold for $R^3$. Which axioms in Definition 1 of Section 4.1 are
they?

(d) Show that Axioms 7, 8, and 9 hold. 


(e) Show that Axiom 10 fails for the given operations. 


Answer: 


(a) u + v = (4, 3, 2), ku = (-3, 0, 0)
(c) Axioms 1-5


2. In each part, the solution space of the system is a subspace of $R^3$ and so must be a line through the origin,
a plane through the origin, all of $R^3$, or the origin only. For each system, determine which is the case. If
the subspace is a plane, find an equation for it, and if it is a line, find parametric equations.

(a) 0x + 0y + 0z = 0
(b) 2x - 3y + z = 0
    6x - 9y + 3z = 0
    -4x + 6y - 2z = 0
(c) x - 2y + 7z = 0
    -4x + 8y + 5z = 0
    2x - 4y + 3z = 0
(d) x + 4y + 8z = 0
    2x + 5y + 6z = 0
    3x + y - 4z = 0


3. For what values of s is the solution space of
   $x_1 + x_2 + sx_3 = 0$
   $x_1 + sx_2 + x_3 = 0$
   $sx_1 + x_2 + x_3 = 0$
the origin only, a line through the origin, a plane through the origin, or all of $R^3$?


Answer: 


If $s \ne 1, -2$, the solution space is the origin. If s = 1, the solution space is a plane through the origin. If
s = -2, the solution space is a line through the origin.

4. (a) Express (4a, a - b, a + 2b) as a linear combination of (4, 1, 1) and (0, -1, 2).


(b) Express (3a + b + 3c, -a + 4b - c, 2a + b + 2c) as a linear combination of (3, -1, 2) and
(1, 4, 1).


(c) Express (2a - b + 4c, 3a - c, 4b + c) as a linear combination of three nonzero vectors.


5. Let W be the space spanned by f = sin x and g = cos x.
(a) Show that for any value of $\theta$, $f_1 = \sin(x + \theta)$ and $g_1 = \cos(x + \theta)$ are vectors in W.
(b) Show that $f_1$ and $g_1$ form a basis for W.


6. (a) Express v = (1, 1) as a linear combination of $v_1 = (1, -1)$, $v_2 = (3, 0)$, and $v_3 = (2, 1)$ in two
different ways.


(b) Explain why this does not violate Theorem 4.4.1. 


7. Let A be an n × n matrix, and let $v_1, v_2, \ldots, v_n$ be linearly independent vectors in $R^n$ expressed as n × 1
matrices. What must be true about A for $Av_1, Av_2, \ldots, Av_n$ to be linearly independent?


Answer: 


A must be invertible 
8. Must a basis for $P_n$ contain a polynomial of degree k for each k = 0, 1, 2, ..., n? Justify your answer.


9. For the purpose of this exercise, let us define a “checkerboard matrix” to be a square matrix $A = [a_{ij}]$
such that

\[ a_{ij} = \begin{cases} 1 & \text{if } i + j \text{ is even} \\ 0 & \text{if } i + j \text{ is odd} \end{cases} \]

Find the rank and nullity of the following checkerboard matrices.
(a) The 3 × 3 checkerboard matrix.
(b) The 4 × 4 checkerboard matrix.
(c) The n × n checkerboard matrix.


Answer: 


(a) Rank = 2, nullity = 1 
(b) Rank = 2, nullity = 2 
(c) Rank = 2, nullity = n - 2


10. For the purpose of this exercise, let us define an “X-matrix” to be a square matrix with an odd number of
rows and columns that has 0's everywhere except on the two diagonals where it has 1's. Find the rank and
nullity of the following X-matrices.

(a) \[ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]

(b) \[ \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \end{bmatrix} \]

(c) the X-matrix of size (2n + 1) × (2n + 1)


11. In each part, show that the stated set of polynomials is a subspace of $P_n$ and find a basis for it.


(a) All polynomials in $P_n$ such that p(-x) = p(x).
(b) All polynomials in $P_n$ such that p(0) = 0.


Answer: 


(a) $1, x^2, x^4, \ldots, x^{2m}$ where 2m = n if n is even and 2m = n - 1 if n is odd.
LEE ыы 
12. (Calculus required) Show that the set of all polynomials in $P_n$ that have a horizontal tangent at x = 0 is a
subspace of $P_n$. Find a basis for this subspace.


13. (a) Find a basis for the vector space of all 3 × 3 symmetric matrices.


(b) Find a basis for the vector space of all 3 × 3 skew-symmetric matrices.


Answer: 

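(a) \[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(b) \[ \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix} \]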


14. Various advanced texts in linear algebra prove the following determinant criterion for rank: The rank of a
matrix A is r if and only if A has some r × r submatrix with a nonzero determinant, and all square
submatrices of larger size have determinant zero. [Note: A submatrix of A is any matrix obtained by
deleting rows or columns of A. The matrix A itself is also considered to be a submatrix of A.] In each part,
use this criterion to find the rank of the matrix.


(а) |12 0 
E 4 “| 

[123 
E 4 d 


(o) |1. 01 


(d) \[ \begin{bmatrix} 1 & -1 & 2 & 0 \\ 3 & 1 & 0 & 0 \\ -1 & 2 & 4 & 0 \end{bmatrix} \]
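15. Use the result in Exercise 14 above to find the possible ranks for matrices of the form

\[ \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & a_{16} \\ 0 & 0 & 0 & 0 & 0 & a_{26} \\ 0 & 0 & 0 & 0 & 0 & a_{36} \\ 0 & 0 & 0 & 0 & 0 & a_{46} \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} & a_{56} \end{bmatrix} \]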


Answer: 


Possible ranks are 2, 1, and 0. 


16. Prove: If S is a basis for a vector space V, then for any vectors u and v in V and any scalar k, the following
relationships hold.


(a) $(u + v)_S = (u)_S + (v)_S$
(b) $(ku)_S = k(u)_S$
5.4 Differential Equations


INTRODUCTION 


In this chapter we will focus on classes of scalars and vectors known as “eigenvalues” and
“eigenvectors,” terms derived from the German word eigen, meaning “own,” “peculiar
to,” “characteristic,” or “individual.” The underlying idea first appeared in the study of
rotational motion but was later used to classify various kinds of surfaces and to describe 
solutions of certain differential equations. In the early 1900s it was applied to matrices and 
matrix transformations, and today it has applications in such diverse fields as computer 
graphics, mechanical vibrations, heat flow, population dynamics, quantum mechanics, and 
economics to name just a few. 
5.1 Eigenvalues and Eigenvectors


In this section we will define the notions of “eigenvalue” and “eigenvector” and discuss some of their basic 
properties. 


Definition of Eigenvalue and Eigenvector 


We begin with the main definition in this section. 


DEFINITION 1 


If A is an n × n matrix, then a nonzero vector x in $R^n$ is called an eigenvector of A (or of the matrix
operator $T_A$) if Ax is a scalar multiple of x; that is,

\[ Ax = \lambda x \]

for some scalar $\lambda$. The scalar $\lambda$ is called an eigenvalue of A (or of $T_A$), and x is said to be an
eigenvector corresponding to $\lambda$.


The requirement that an eigenvector be
nonzero is imposed to avoid the unimportant
case $A0 = \lambda 0$, which holds for every A and $\lambda$.


In general, the image of a vector x under multiplication by a square matrix A differs from x in both magnitude
and direction. However, in the special case where x is an eigenvector of A, multiplication by A leaves the
direction unchanged. For example, in $R^2$ or $R^3$ multiplication by A maps each eigenvector x of A (if any)
along the same line through the origin as x. Depending on the sign and magnitude of the eigenvalue $\lambda$
corresponding to x, the operation $Ax = \lambda x$ compresses or stretches x by a factor of $\lambda$, with a reversal of
direction in the case where $\lambda$ is negative (Figure 5.1.1).


Figure 5.1.1 (panels: (a) $0 \le \lambda \le 1$, (b) $\lambda \ge 1$, (c) $-1 \le \lambda \le 0$, (d) $\lambda \le -1$)


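EXAMPLE 1 Eigenvector of a 2 × 2 Matrix


The vector $x = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector of

\[ A = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix} \]

corresponding to the eigenvalue $\lambda = 3$, since

\[ Ax = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3x \]

Geometrically, multiplication by A has stretched the vector x by a factor of 3 (Figure 5.1.2).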


Figure 5.1.2 


Computing Eigenvalues and Eigenvectors 


Our next objective is to obtain a general procedure for finding eigenvalues and eigenvectors of an n × n
matrix A. We will begin with the problem of finding the eigenvalues of A. Note first that the equation
$Ax = \lambda x$ can be rewritten as $Ax = \lambda Ix$, or equivalently, as

\[ (\lambda I - A)x = 0 \]

For $\lambda$ to be an eigenvalue of A this equation must have a nonzero solution for x. But it follows from parts (b)
and (g) of Theorem 4.10.4 that this is so if and only if the coefficient matrix $\lambda I - A$ has a zero determinant.
Thus, we have the following result.


THEOREM 5.1.1


If A is an n × n matrix, then $\lambda$ is an eigenvalue of A if and only if it satisfies the equation

\[ \det(\lambda I - A) = 0 \tag{1} \]

This is called the characteristic equation of A.


EXAMPLE 2 Finding Eigenvalues


In Example 1 we observed that $\lambda = 3$ is an eigenvalue of the matrix

\[ A = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix} \]

but we did not explain how we found it. Use the characteristic equation to find all eigenvalues
of this matrix.


Solution It follows from Formula 1 that the eigenvalues of A are the solutions of the equation
$\det(\lambda I - A) = 0$, which we can write as

\[ \begin{vmatrix} \lambda - 3 & 0 \\ -8 & \lambda + 1 \end{vmatrix} = 0 \]

from which we obtain

\[ (\lambda - 3)(\lambda + 1) = 0 \tag{2} \]

This shows that the eigenvalues of A are $\lambda = 3$ and $\lambda = -1$. Thus, in addition to the
eigenvalue $\lambda = 3$ noted in Example 1, we have discovered a second eigenvalue $\lambda = -1$.


When the determinant $\det(\lambda I - A)$ that appears on the left side of 1 is expanded, the result is a polynomial
$p(\lambda)$ of degree n that is called the characteristic polynomial of A. For example, it follows from 2 that the
characteristic polynomial of the 2 × 2 matrix A in Example 2 is

\[ p(\lambda) = (\lambda - 3)(\lambda + 1) = \lambda^2 - 2\lambda - 3 \]

which is a polynomial of degree 2. In general, the characteristic polynomial of an n × n matrix has the form

\[ p(\lambda) = \lambda^n + c_1\lambda^{n-1} + \cdots + c_n \]

in which the coefficient of $\lambda^n$ is 1 (Exercise 17). Since a polynomial of degree n has at most n distinct roots, it
follows that the equation

\[ \lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0 \tag{3} \]

has at most n distinct solutions and consequently that an n × n matrix has at most n distinct eigenvalues. Since
some of these solutions may be complex numbers, it is possible for a matrix to have complex eigenvalues,
even if that matrix itself has real entries. We will discuss this issue in more detail later, but for now we will
focus on examples in which the eigenvalues are real numbers.
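For a 2 × 2 matrix the characteristic polynomial can be written down directly, since expanding $\det(\lambda I - A)$ gives $\lambda^2 - (a + d)\lambda + (ad - bc)$ for a matrix with rows (a, b) and (c, d). The sketch below uses that expansion together with the quadratic formula; the function names are hypothetical, and only real eigenvalues are returned, in keeping with the restriction just stated.

```python
import math

def char_poly_2x2(A):
    """Coefficients (1, c1, c2) of det(lambda*I - A) = lambda^2 + c1*lambda + c2."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)

def real_eigenvalues_2x2(A):
    """Real roots of the characteristic polynomial, in increasing order."""
    _, c1, c2 = char_poly_2x2(A)
    disc = c1 * c1 - 4 * c2
    if disc < 0:
        return []                      # complex eigenvalues are not handled here
    root = math.sqrt(disc)
    return sorted({(-c1 - root) / 2, (-c1 + root) / 2})

A = [[3, 0],
     [8, -1]]
print(char_poly_2x2(A))
print(real_eigenvalues_2x2(A))
```

Applied to the matrix of Example 2, the coefficients correspond to $\lambda^2 - 2\lambda - 3$, in agreement with the factored form in Formula 2.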


EXAMPLE 3 Eigenvalues of a 3 × 3 Matrix


Find the eigenvalues of

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix} \]

Solution The characteristic polynomial of A is

\[ \det(\lambda I - A) = \det \begin{bmatrix} \lambda & -1 & 0 \\ 0 & \lambda & -1 \\ -4 & 17 & \lambda - 8 \end{bmatrix} = \lambda^3 - 8\lambda^2 + 17\lambda - 4 \]

The eigenvalues of A must therefore satisfy the cubic equation

\[ \lambda^3 - 8\lambda^2 + 17\lambda - 4 = 0 \tag{4} \]

To solve this equation, we will begin by searching for integer solutions. This task can be
simplified by exploiting the fact that all integer solutions (if there are any) of a polynomial
equation with integer coefficients

\[ \lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0 \]

must be divisors of the constant term, $c_n$. Thus, the only possible integer solutions of 4 are the
divisors of -4, that is, ±1, ±2, ±4. Successively substituting these values in 4 shows that
$\lambda = 4$ is an integer solution. As a consequence, $\lambda - 4$ must be a factor of the left side of 4.
Dividing $\lambda - 4$ into $\lambda^3 - 8\lambda^2 + 17\lambda - 4$ shows that 4 can be rewritten as

\[ (\lambda - 4)(\lambda^2 - 4\lambda + 1) = 0 \]

Thus, the remaining solutions of 4 satisfy the quadratic equation

\[ \lambda^2 - 4\lambda + 1 = 0 \]

which can be solved by the quadratic formula. Thus the eigenvalues of A are

\[ \lambda = 4, \quad \lambda = 2 + \sqrt{3}, \quad \text{and} \quad \lambda = 2 - \sqrt{3} \]


In applications involving large matrices
it is often not feasible to compute the
characteristic equation directly so other
methods must be used to find
eigenvalues. We will consider such
methods in Chapter 9.

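EXAMPLE 4 Eigenvalues of an Upper Triangular Matrix


Find the eigenvalues of the upper triangular matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ 0 & a_{22} & a_{23} & a_{24} \\ 0 & 0 & a_{33} & a_{34} \\ 0 & 0 & 0 & a_{44} \end{bmatrix} \]

Solution Recalling that the determinant of a triangular matrix is the product of the entries on
the main diagonal (Theorem 2.1.2), we obtain

\[ \det(\lambda I - A) = \det \begin{bmatrix} \lambda - a_{11} & -a_{12} & -a_{13} & -a_{14} \\ 0 & \lambda - a_{22} & -a_{23} & -a_{24} \\ 0 & 0 & \lambda - a_{33} & -a_{34} \\ 0 & 0 & 0 & \lambda - a_{44} \end{bmatrix} = (\lambda - a_{11})(\lambda - a_{22})(\lambda - a_{33})(\lambda - a_{44}) \]

Thus, the characteristic equation is

\[ (\lambda - a_{11})(\lambda - a_{22})(\lambda - a_{33})(\lambda - a_{44}) = 0 \]

and the eigenvalues are

\[ \lambda = a_{11}, \quad \lambda = a_{22}, \quad \lambda = a_{33}, \quad \lambda = a_{44} \]

which are precisely the diagonal entries of A.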


The following general theorem should be evident from the computations in the preceding example. 


THEOREM 5.1.2 


If A is an n × n triangular matrix (upper triangular, lower triangular, or diagonal), then the eigenvalues
of A are the entries on the main diagonal of A.


EXAMPLE 5 Eigenvalues of a Lower Triangular Matrix — 


By inspection, the eigenvalues of the lower triangular matrix 


5 0 0 
—|_1 2 
А=|-1 $ 0 

1 

5 -8 —] 


Had Theorem 5.1.2 been available earlier, we 
could have anticipated the result obtained in 
Example 2. 


THEOREM 5.1.3 


If A is an n × n matrix, the following statements are equivalent.

(a) $\lambda$ is an eigenvalue of A.

(b) The system of equations $(\lambda I - A)x = 0$ has nontrivial solutions.

(c) There is a nonzero vector x such that $Ax = \lambda x$.

(d) $\lambda$ is a solution of the characteristic equation $\det(\lambda I - A) = 0$.


Finding Eigenvectors and Bases for Eigenspaces 


Now that we know how to find the eigenvalues of a matrix, we will consider the problem of finding the
corresponding eigenvectors. Since the eigenvectors corresponding to an eigenvalue $\lambda$ of a matrix A are the
nonzero vectors that satisfy the equation

\[ (\lambda I - A)x = 0 \]

these eigenvectors are the nonzero vectors in the null space of the matrix $\lambda I - A$. We call this null space the
eigenspace of A corresponding to $\lambda$. Stated another way, the eigenspace of A corresponding to the eigenvalue
$\lambda$ is the solution space of the homogeneous system $(\lambda I - A)x = 0$.


Notice that x = 0 is in every eigenspace even
though it is not an eigenvector. Thus, it is the
nonzero vectors in an eigenspace that are the
eigenvectors.

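EXAMPLE 6 Bases for Eigenspaces


Find bases for the eigenspaces of the matrix

\[ A = \begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix} \]

Solution In Example 1 we found the characteristic equation of $A$ to be

\[ (\lambda - 3)(\lambda + 1) = 0 \]

from which we obtained the eigenvalues $\lambda = 3$ and $\lambda = -1$. Thus, there are two eigenspaces
of $A$, one corresponding to each of these eigenvalues.

By definition,

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \]

is an eigenvector of $A$ corresponding to an eigenvalue $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution
of $(\lambda I - A)\mathbf{x} = \mathbf{0}$, that is, of

\[ \begin{bmatrix} \lambda - 3 & 0 \\ -8 & \lambda + 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

If $\lambda = 3$, then this equation becomes

\[ \begin{bmatrix} 0 & 0 \\ -8 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

whose general solution is

\[ x_1 = \tfrac{1}{2}t, \quad x_2 = t \]

(verify) or in matrix form,

\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2}t \\ t \end{bmatrix} = t \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix} \]

Thus,

\[ \begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix} \]

is a basis for the eigenspace corresponding to $\lambda = 3$. We leave it as an exercise for you to
follow the pattern of these computations and show that

\[ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

is a basis for the eigenspace corresponding to $\lambda = -1$.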
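The computation in this example amounts to finding a basis for the null space of $\lambda I - A$. A minimal sketch of that procedure, using exact fractions and reduced row echelon form, is given below; the helper names `null_space_basis` and `eigenspace_basis` are hypothetical and not from the text.

```python
from fractions import Fraction

def null_space_basis(M):
    """Basis of the solution space of Mx = 0, read off from the reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pr is None:
            continue                      # no pivot here: column c is a free variable
        M[r], M[pr] = M[pr], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    basis = []
    for free in (c for c in range(cols) if c not in pivots):
        v = [Fraction(0)] * cols
        v[free] = Fraction(1)             # set one free variable to 1, the others to 0
        for row_idx, pc in enumerate(pivots):
            v[pc] = -M[row_idx][free]
        basis.append(v)
    return basis

def eigenspace_basis(A, lam):
    """Basis for the eigenspace of A for eigenvalue lam: null space of lam*I - A."""
    n = len(A)
    return null_space_basis([[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)])

A = [[3, 0], [8, -1]]
print(eigenspace_basis(A, 3))    # one basis vector, (1/2, 1)
print(eigenspace_basis(A, -1))   # one basis vector, (0, 1)
```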


Historical Note Methods of linear algebra are used in the emerging field of computerized face
recognition. Researchers are working with the idea that every human face in a racial group is a
combination of a few dozen primary shapes. For example, by analyzing three-dimensional scans of
many faces, researchers at Rockefeller University have produced both an average head shape in the
Caucasian group, dubbed the meanhead (top row left in the figure to the left), and a set of
standardized variations from that shape, called eigenheads (15 of which are shown in the picture).
These are so named because they are eigenvectors of a certain matrix that stores digitized facial
information. Face shapes are represented mathematically as linear combinations of the eigenheads.
[Image: Courtesy Dr. Joseph Atick, Dr. Norman Redlich, and Dr. Paul Griffith]


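EXAMPLE 7 Eigenvectors and Bases for Eigenspaces


Find bases for the eigenspaces of

\[ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]

Solution The characteristic equation of $A$ is $\lambda^3 - 5\lambda^2 + 8\lambda - 4 = 0$, or in factored form,
$(\lambda - 1)(\lambda - 2)^2 = 0$ (verify). Thus, the distinct eigenvalues of $A$ are $\lambda = 1$ and $\lambda = 2$, so there
are two eigenspaces of $A$.

By definition,

\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]

is an eigenvector of $A$ corresponding to $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution of
$(\lambda I - A)\mathbf{x} = \mathbf{0}$, or in matrix form,

\[ \begin{bmatrix} \lambda & 0 & 2 \\ -1 & \lambda - 2 & -1 \\ -1 & 0 & \lambda - 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \qquad (5) \]

In the case where $\lambda = 2$, Formula 5 becomes

\[ \begin{bmatrix} 2 & 0 & 2 \\ -1 & 0 & -1 \\ -1 & 0 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Solving this system using Gaussian elimination yields (verify)

\[ x_1 = -s, \quad x_2 = t, \quad x_3 = s \]

Thus, the eigenvectors of $A$ corresponding to $\lambda = 2$ are the nonzero vectors of the form

\[ \mathbf{x} = \begin{bmatrix} -s \\ t \\ s \end{bmatrix} = \begin{bmatrix} -s \\ 0 \\ s \end{bmatrix} + \begin{bmatrix} 0 \\ t \\ 0 \end{bmatrix} = s \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} + t \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

Since

\[ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

are linearly independent (why?), these vectors form a basis for the eigenspace corresponding to
$\lambda = 2$.

If $\lambda = 1$, then 5 becomes

\[ \begin{bmatrix} 1 & 0 & 2 \\ -1 & -1 & -1 \\ -1 & 0 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Solving this system yields (verify)

\[ x_1 = -2s, \quad x_2 = s, \quad x_3 = s \]

Thus, the eigenvectors corresponding to $\lambda = 1$ are the nonzero vectors of the form

\[ \begin{bmatrix} -2s \\ s \\ s \end{bmatrix} = s \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \quad \text{so that} \quad \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

is a basis for the eigenspace corresponding to $\lambda = 1$.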
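The "(verify)" steps in this example can be checked by direct multiplication. The sketch below confirms $A\mathbf{x} = \lambda\mathbf{x}$ for each basis vector found above; the helper `mat_vec` and the list `claims` are names introduced only for this illustration.

```python
def mat_vec(A, x):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]

# (eigenvalue, eigenvector) pairs claimed in Example 7.
claims = [
    (2, [-1, 0, 1]),
    (2, [0, 1, 0]),
    (1, [-2, 1, 1]),
]

for lam, x in claims:
    # Ax should equal lam * x, entry by entry.
    print(lam, x, mat_vec(A, x) == [lam * xi for xi in x])
```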


Powers of a Matrix 


Once the eigenvalues and eigenvectors of a matrix $A$ are found, it is a simple matter to find the eigenvalues
and eigenvectors of any positive integer power of $A$; for example, if $\lambda$ is an eigenvalue of $A$ and $\mathbf{x}$ is a
corresponding eigenvector, then

\[ A^2\mathbf{x} = A(A\mathbf{x}) = A(\lambda\mathbf{x}) = \lambda(A\mathbf{x}) = \lambda(\lambda\mathbf{x}) = \lambda^2\mathbf{x} \]

which shows that $\lambda^2$ is an eigenvalue of $A^2$ and that $\mathbf{x}$ is a corresponding eigenvector. In general, we have the
following result.


THEOREM 5.1.4


If $k$ is a positive integer, $\lambda$ is an eigenvalue of a matrix $A$, and $\mathbf{x}$ is a corresponding eigenvector, then
$\lambda^k$ is an eigenvalue of $A^k$ and $\mathbf{x}$ is a corresponding eigenvector.


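EXAMPLE 8 Powers of a Matrix


In Example 7 we showed that the eigenvalues of

\[ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]

are $\lambda = 2$ and $\lambda = 1$, so from Theorem 5.1.4 both $\lambda = 2^7 = 128$ and $\lambda = 1^7 = 1$ are eigenvalues of
$A^7$. We also showed that

\[ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

are eigenvectors of $A$ corresponding to the eigenvalue $\lambda = 2$, so from Theorem 5.1.4 they are also
eigenvectors of $A^7$ corresponding to $\lambda = 2^7 = 128$. Similarly, the eigenvector

\[ \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

of $A$ corresponding to the eigenvalue $\lambda = 1$ is also an eigenvector of $A^7$ corresponding to
$\lambda = 1^7 = 1$.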
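The claims of this example are easy to confirm by brute force. The following sketch, with helper names (`mat_mul`, `mat_pow`, `mat_vec`) chosen only for illustration, forms $A^7$ by repeated multiplication and checks that each eigenvector of $A$ is scaled by the seventh power of its eigenvalue.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, k):
    """A^k for a positive integer k, by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    for _ in range(k):
        result = mat_mul(result, A)
    return result

def mat_vec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
A7 = mat_pow(A, 7)

# Each eigenvector of A for eigenvalue lam is an eigenvector of A^7 for lam^7.
for lam, x in [(2, [-1, 0, 1]), (2, [0, 1, 0]), (1, [-2, 1, 1])]:
    print(lam**7, mat_vec(A7, x) == [lam**7 * xi for xi in x])
```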


Eigenvalues and Invertibility 


The next theorem establishes a relationship between eigenvalues and the invertibility of a matrix. 


THEOREM 5.1.5 


A square matrix $A$ is invertible if and only if $\lambda = 0$ is not an eigenvalue of $A$.


Proof Assume that $A$ is an $n \times n$ matrix and observe first that $\lambda = 0$ is a solution of the characteristic
equation

\[ \lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0 \]

if and only if the constant term $c_n$ is zero. Thus, it suffices to prove that $A$ is invertible if and only if $c_n \neq 0$.
But

\[ \det(\lambda I - A) = \lambda^n + c_1\lambda^{n-1} + \cdots + c_n \]

or, on setting $\lambda = 0$,

\[ \det(-A) = c_n \quad \text{or} \quad (-1)^n \det(A) = c_n \]

It follows from the last equation that $\det(A) = 0$ if and only if $c_n = 0$, and this in turn implies that $A$ is
invertible if and only if $c_n \neq 0$.


EXAMPLE 9 Eigenvalues and Invertibility


The matrix $A$ in Example 7 is invertible since it has eigenvalues $\lambda = 1$ and $\lambda = 2$, neither of which
is zero. We leave it for you to check this conclusion by showing that $\det(A) \neq 0$.
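One way to carry out this check, sketched below under the assumption that a direct cofactor expansion is acceptable for a $3 \times 3$ matrix, is to compute $\det(A)$ and compare it with the constant term of the characteristic polynomial $\lambda^3 - 5\lambda^2 + 8\lambda - 4$ from Example 7, using the relation $(-1)^n \det(A) = c_n$ from the proof of Theorem 5.1.5. The function name `det3` is introduced only for this illustration.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
n = 3
c_n = -4  # constant term of lambda^3 - 5 lambda^2 + 8 lambda - 4

print(det3(A))                    # nonzero, so A is invertible
print((-1) ** n * det3(A) == c_n) # relation from the proof of Theorem 5.1.5
```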


More on the Equivalence Theorem 


As our final result in this section, we will use Theorem 5.1.5 to add one additional part to Theorem 4.10.4. 


THEOREM 5.1.6 Equivalent Statements


If $A$ is an $n \times n$ matrix, then the following statements are equivalent.

(a) $A$ is invertible.

(b) $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.

(c) The reduced row echelon form of $A$ is $I_n$.

(d) $A$ is expressible as a product of elementary matrices.

(e) $A\mathbf{x} = \mathbf{b}$ is consistent for every $n \times 1$ matrix $\mathbf{b}$.

(f) $A\mathbf{x} = \mathbf{b}$ has exactly one solution for every $n \times 1$ matrix $\mathbf{b}$.

(g) $\det(A) \neq 0$.

(h) The column vectors of $A$ are linearly independent.

(i) The row vectors of $A$ are linearly independent.

(j) The column vectors of $A$ span $R^n$.

(k) The row vectors of $A$ span $R^n$.

(l) The column vectors of $A$ form a basis for $R^n$.

(m) The row vectors of $A$ form a basis for $R^n$.

(n) $A$ has rank $n$.

(o) $A$ has nullity 0.

(p) The orthogonal complement of the null space of $A$ is $R^n$.

(q) The orthogonal complement of the row space of $A$ is $\{\mathbf{0}\}$.

(r) The range of $T_A$ is $R^n$.

(s) $T_A$ is one-to-one.

(t) $\lambda = 0$ is not an eigenvalue of $A$.


This theorem relates all of the major topics we have studied thus far. 


Concept Review 

* Eigenvector 

* Eigenvalue 

* Characteristic equation 

* Characteristic polynomial 

* Eigenspace 

* Equivalence Theorem 

Skills 

* Find the eigenvalues of a matrix. 


* Find bases for the eigenspaces of a matrix. 


Exercise Set 5.1 


In Exercises 1–2, confirm by multiplication that $\mathbf{x}$ is an eigenvector of $A$, and find the corresponding
eigenvalue.


Answer 
5 
2 2 =] —1 1 
А= | =1 2 =]|; х= |1 
=] -1 2 1 


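3. Find the characteristic equations of the following matrices:

(a) $\begin{bmatrix} 3 & 0 \\ 8 & -1 \end{bmatrix}$

(b) $\begin{bmatrix} 10 & -9 \\ 4 & -2 \end{bmatrix}$

Answer:

(a) $\lambda^2 - 2\lambda - 3 = 0$
(b) $\lambda^2 - 8\lambda + 16 = 0$
(c) $\lambda^2 - 12 = 0$
(d) $\lambda^2 + 3 = 0$
(e) $\lambda^2 = 0$
(f) $\lambda^2 - 2\lambda + 1 = 0$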


4. Find the eigenvalues of the matrices in Exercise 3.


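5. Find bases for the eigenspaces of the matrices in Exercise 3.


Answer:

(a) Basis for eigenspace corresponding to $\lambda = 3$: $\begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix}$; basis for eigenspace corresponding to $\lambda = -1$: $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$

(b) Basis for eigenspace corresponding to $\lambda = 4$: $\begin{bmatrix} \tfrac{3}{2} \\ 1 \end{bmatrix}$

(c) Basis for eigenspace corresponding to $\lambda = \sqrt{12}$: $\begin{bmatrix} \tfrac{3}{\sqrt{12}} \\ 1 \end{bmatrix}$; basis for eigenspace corresponding to $\lambda = -\sqrt{12}$: $\begin{bmatrix} -\tfrac{3}{\sqrt{12}} \\ 1 \end{bmatrix}$

(d) There are no eigenspaces.

(e) Basis for eigenspace corresponding to $\lambda = 0$: $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$

(f) Basis for eigenspace corresponding to $\lambda = 1$: $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$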


6. Find the characteristic equations of the following matrices: 
(a) 401 
—2 10 
—2 0 1 


(b) 


(d) 


2 bo 2 
0 —1 —8 
1 0 -2 


(f) 


| 
15 

FEN 

| 

| 

| 


7. Find the eigenvalues of the matrices in Exercise 6.


Answer:


(a) 1, 2, 3
(b) $-\sqrt{2}$, 0, $\sqrt{2}$
(c) $-8$
(d) 2
(e) 2
(f) $-4$, 3
8. Find bases for the eigenspaces of the matrices in Exercise 6. 


9. Find the characteristic equations of the following matrices: 


(а) foo 20 
1207 Чуй 
(^35 0 
(5. 023 

œ) fio -9 о о 
da^ c d 
0 cem Sj 
Gc ow 4: 2 


Answer: 


(a) AHAS — 334 A220 
(b) A4 — 833 4-197 — 2444+ 48 =0 


10. Find the eigenvalues of the matrices in Exercise 9. 


11. Find bases for the eigenspaces of the matrices in Exercise 9. 


Answer: 
(a) 2] [o = 
з= : н MUT : 
0 1 0 
(b) 3 
2 
A=4:basis | 1 
0 
0 


‚ A= =—1:basis 


12. By inspection, find the eigenvalues of the following matrices: 


[300 
«2 
481 
(с) | 1 
; 000 
1 
0-7 0 0 
0 010 
1 
0 007 


13. Find the eigenvalues of 4? for 


оо ow 
SONI w 


Answer: 


9 
LY m cl caf 
1, | | =z}, 250 


—2 


1 
1 
0 


14. Find the eigenvalues and bases for the eigenspaces of $A^{25}$ for

\[ A = \begin{bmatrix} -1 & -2 & -2 \\ 1 & 2 & 1 \\ -1 & -1 & 0 \end{bmatrix} \]

15. Let $A$ be a $2 \times 2$ matrix, and call a line through the origin of $R^2$ invariant under $A$ if $A\mathbf{x}$ lies on the line
when $\mathbf{x}$ does. Find equations for all lines in $R^2$, if any, that are invariant under the given matrix.

(a) $A = \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}$

Answer:

(a) $y = x$ and $y = 2x$

(b) No lines

(c) $y = 0$

16. Find $\det(A)$ given that $A$ has $p(\lambda)$ as its characteristic polynomial.

(a) $p(\lambda) = \lambda^3 - 2\lambda^2 + \lambda + 5$

(b) $p(\lambda) = \lambda^4 - \lambda^3 + 7$

[Hint: See the proof of Theorem 5.1.5.]

17. Let $A$ be an $n \times n$ matrix.

(a) Prove that the characteristic polynomial of $A$ has degree $n$.

(b) Prove that the coefficient of $\lambda^n$ in the characteristic polynomial is 1.

18. Show that the characteristic equation of a $2 \times 2$ matrix $A$ can be expressed as $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$,
where $\operatorname{tr}(A)$ is the trace of $A$.

19. Use the result in Exercise 18 to show that if

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

then the solutions of the characteristic equation of $A$ are

\[ \lambda = \tfrac{1}{2}\left[ (a + d) \pm \sqrt{(a - d)^2 + 4bc} \right] \]

Use this result to show that $A$ has

(a) two distinct real eigenvalues if $(a - d)^2 + 4bc > 0$.

(b) two repeated real eigenvalues if $(a - d)^2 + 4bc = 0$.

(c) complex conjugate eigenvalues if $(a - d)^2 + 4bc < 0$.


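20. Let $A$ be the matrix in Exercise 19. Show that if $b \neq 0$, then

\[ \mathbf{x}_1 = \begin{bmatrix} -b \\ a - \lambda_1 \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2 = \begin{bmatrix} -b \\ a - \lambda_2 \end{bmatrix} \]

are eigenvectors of $A$ that correspond, respectively, to the eigenvalues

\[ \lambda_1 = \tfrac{1}{2}\left[ (a + d) + \sqrt{(a - d)^2 + 4bc} \right] \]

and

\[ \lambda_2 = \tfrac{1}{2}\left[ (a + d) - \sqrt{(a - d)^2 + 4bc} \right] \]

21. Use the result of Exercise 18 to prove that if $p(\lambda)$ is the characteristic polynomial of a $2 \times 2$ matrix $A$,
then $p(A) = 0$.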
22. Prove: If $a$, $b$, $c$, and $d$ are integers such that $a + b = c + d$, then

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

has integer eigenvalues, namely $\lambda_1 = a + b$ and $\lambda_2 = a - c$.


23. Prove: If $\lambda$ is an eigenvalue of an invertible matrix $A$, and $\mathbf{x}$ is a corresponding eigenvector, then $1/\lambda$ is
an eigenvalue of $A^{-1}$, and $\mathbf{x}$ is a corresponding eigenvector.


24. Prove: If $\lambda$ is an eigenvalue of $A$, $\mathbf{x}$ is a corresponding eigenvector, and $s$ is a scalar, then $\lambda - s$ is an
eigenvalue of $A - sI$, and $\mathbf{x}$ is a corresponding eigenvector.


25. Prove: If $\lambda$ is an eigenvalue of $A$ and $\mathbf{x}$ is a corresponding eigenvector, then $s\lambda$ is an eigenvalue of $sA$ for
every scalar $s$, and $\mathbf{x}$ is a corresponding eigenvector.


26. Find the eigenvalues and bases for the eigenspaces of


m 

Il 

| 

Кю 
Now PO 
Uh о шю 


and then use Exercises 23 and 24 to find the eigenvalues and bases for the eigenspaces of

(a) $A^{-1}$

(b) $A - 3I$

(c) $A + 2I$


27. (a) Prove that if $A$ is a square matrix, then $A$ and $A^T$ have the same eigenvalues. [Hint: Look at the
characteristic equation $\det(\lambda I - A) = 0$.]

(b) Show that $A$ and $A^T$ need not have the same eigenspaces. [Hint: Use the result in Exercise 20 to find
a $2 \times 2$ matrix for which $A$ and $A^T$ have different eigenspaces.]


28. Suppose that the characteristic polynomial of some matrix $A$ is found to be
$p(\lambda) = (\lambda - 1)(\lambda - 3)^2(\lambda - 4)^3$. In each part, answer the question and explain your reasoning.
(a) What is the size of A? 

(b) Is A invertible? 


(c) How many eigenspaces does А have? 


29. The eigenvectors that we have been studying are sometimes called right eigenvectors to distinguish them
from left eigenvectors, which are $n \times 1$ column matrices $\mathbf{x}$ that satisfy the equation $\mathbf{x}^T A = \mu \mathbf{x}^T$ for some
scalar $\mu$. What is the relationship, if any, between the right eigenvectors and corresponding eigenvalues $\lambda$
of $A$ and the left eigenvectors and corresponding eigenvalues $\mu$ of $A$?


True-False Exercises 

In parts (a)-(g) determine whether the statement is true or false, and justify your answer. 

(a) If $A$ is a square matrix and $A\mathbf{x} = \lambda\mathbf{x}$ for some nonzero scalar $\lambda$, then $\mathbf{x}$ is an eigenvector of $A$.
Answer: 


False 


(b) If $\lambda$ is an eigenvalue of a matrix $A$, then the linear system $(\lambda I - A)\mathbf{x} = \mathbf{0}$ has only the trivial solution.
Answer: 


False 


(c) If the characteristic polynomial of a matrix $A$ is $p(\lambda) = \lambda^2 + 1$, then $A$ is invertible.


Answer: 


True 


(d) If $\lambda$ is an eigenvalue of a matrix $A$, then the eigenspace of $A$ corresponding to $\lambda$ is the set of eigenvectors
of $A$ corresponding to $\lambda$.


Answer: 


False 


(e) If 0 is an eigenvalue of a matrix $A$, then $A^2$ is singular.


Answer: 


True 


(f) The eigenvalues of a matrix A are the same as the eigenvalues of the reduced row echelon form of A. 
Answer: 


False 


(g) If 0 is an eigenvalue of a matrix A, then the set of columns of A is linearly independent. 
Answer: 


False 
5.2 Diagonalization


In this section we will be concerned with the problem of finding a basis for $R^n$ that consists of eigenvectors of an
$n \times n$ matrix $A$. Such bases can be used to study geometric properties of $A$ and to simplify various numerical
computations. These bases are also of physical significance in a wide variety of applications, some of which will be
considered later in this text.


The Matrix Diagonalization Problem 


Our first objective in this section is to show that the following two seemingly different problems are equivalent. 


Problem 1 Given an $n \times n$ matrix $A$, does there exist an invertible matrix $P$ such that $P^{-1}AP$ is diagonal?


Problem 2 Given an $n \times n$ matrix $A$, does $A$ have $n$ linearly independent eigenvectors?


Similarity 


The matrix product $P^{-1}AP$ that appears in Problem 1 is called a similarity transformation of the matrix $A$. Such
products are important in the study of eigenvectors and eigenvalues, so we will begin with some terminology about
them.


DEFINITION 1 


If $A$ and $B$ are square matrices, then we say that $B$ is similar to $A$ if there is an invertible matrix $P$ such that
$B = P^{-1}AP$.


Note that if $B$ is similar to $A$, then it is also true that $A$ is similar to $B$, since we can express $A$ as $A = Q^{-1}BQ$ by
taking $Q = P^{-1}$. This being the case, we will usually say that $A$ and $B$ are similar matrices if either is similar to
the other.


Similarity Invariants 


Similar matrices have many properties in common. For example, if $B = P^{-1}AP$, then it follows that $A$ and $B$ have
the same determinant, since

\[ \det(B) = \det(P^{-1}AP) = \det(P^{-1})\det(A)\det(P) = \frac{1}{\det(P)}\det(A)\det(P) = \det(A) \]

In general, any property that is shared by all similar matrices is called a similarity invariant or is said to be
invariant under similarity. Table 1 lists the most important similarity invariants. The proofs of some of these
results are given as exercises.


Table 1 Similarity Invariants

Property                     Description
Determinant                  $A$ and $P^{-1}AP$ have the same determinant.
Invertibility                $A$ is invertible if and only if $P^{-1}AP$ is invertible.
Rank                         $A$ and $P^{-1}AP$ have the same rank.
Nullity                      $A$ and $P^{-1}AP$ have the same nullity.
Trace                        $A$ and $P^{-1}AP$ have the same trace.
Characteristic polynomial    $A$ and $P^{-1}AP$ have the same characteristic polynomial.
Eigenvalues                  $A$ and $P^{-1}AP$ have the same eigenvalues.
Eigenspace dimension         If $\lambda$ is an eigenvalue of $A$ and hence of $P^{-1}AP$, then the eigenspace of $A$
                             corresponding to $\lambda$ and the eigenspace of $P^{-1}AP$ corresponding to $\lambda$ have the
                             same dimension.
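Two of the invariants in Table 1 can be illustrated with a small computation. The sketch below forms $B = P^{-1}AP$ for one pair of $2 \times 2$ matrices and compares determinants and traces. The matrices `A`, `P`, and `P_inv` are assumptions made for this example (with $\det(P) = 1$ so that the inverse has integer entries); checking one pair illustrates the invariants but of course does not prove them.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the main-diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

# Illustrative matrices chosen for this sketch.
A = [[1, 2], [3, 4]]
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]   # inverse of P, since det(P) = 1

B = mat_mul(mat_mul(P_inv, A), P)   # B = P^{-1} A P is similar to A

print(B)
print(det2(A), det2(B))     # equal determinants
print(trace(A), trace(B))   # equal traces
```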


Expressed in the language of similarity, Problem 1 posed above is equivalent to asking whether the matrix $A$ is
similar to a diagonal matrix. If so, the diagonal matrix will have all of the similarity-invariant properties of $A$, but
will have a simpler form, making it easier to analyze and work with. This important idea has some associated
terminology.


DEFINITION 2 


A square matrix $A$ is said to be diagonalizable if it is similar to some diagonal matrix; that is, if there exists
an invertible matrix $P$ such that $P^{-1}AP$ is diagonal. In this case the matrix $P$ is said to diagonalize $A$.


The following theorem shows that Problems 1 and 2 posed above are actually two different forms of the same
mathematical problem.


THEOREM 5.2.1 


If $A$ is an $n \times n$ matrix, the following statements are equivalent.


(a) $A$ is diagonalizable.


(b) $A$ has $n$ linearly independent eigenvectors.


Part (b) of Theorem 5.2.1 is equivalent to saying
that there is a basis for $R^n$ consisting of
eigenvectors of $A$. Why?


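Proof (a) ⇒ (b) Since $A$ is assumed to be diagonalizable, it follows that there exists an invertible matrix $P$ and a
diagonal matrix $D$ such that $P^{-1}AP = D$ or, equivalently,

\[ AP = PD \qquad (1) \]

If we denote the column vectors of $P$ by $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$, and if we assume that the diagonal entries of $D$ are
$\lambda_1, \lambda_2, \ldots, \lambda_n$, then by Formula 6 of Section 1.3 the left side of 1 can be expressed as

\[ AP = A[\mathbf{p}_1 \;\; \mathbf{p}_2 \;\; \cdots \;\; \mathbf{p}_n] = [A\mathbf{p}_1 \;\; A\mathbf{p}_2 \;\; \cdots \;\; A\mathbf{p}_n] \]

and, as noted in the comment following Example 1 of Section 1.7, the right side of 1 can be expressed as

\[ PD = [\lambda_1\mathbf{p}_1 \;\; \lambda_2\mathbf{p}_2 \;\; \cdots \;\; \lambda_n\mathbf{p}_n] \]

Thus, it follows from 1 that

\[ A\mathbf{p}_1 = \lambda_1\mathbf{p}_1, \quad A\mathbf{p}_2 = \lambda_2\mathbf{p}_2, \quad \ldots, \quad A\mathbf{p}_n = \lambda_n\mathbf{p}_n \qquad (2) \]

Since $P$ is invertible, we know from Theorem 5.1.6 that its column vectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are linearly independent
(and hence nonzero). Thus, it follows from 2 that these $n$ column vectors are eigenvectors of $A$.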


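Proof (b) ⇒ (a) Assume that $A$ has $n$ linearly independent eigenvectors, $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$, and that $\lambda_1, \lambda_2, \ldots, \lambda_n$ are
the corresponding eigenvalues. If we let

\[ P = [\mathbf{p}_1 \;\; \mathbf{p}_2 \;\; \cdots \;\; \mathbf{p}_n] \]

and if we let $D$ be the diagonal matrix that has $\lambda_1, \lambda_2, \ldots, \lambda_n$ as its successive diagonal entries, then

\[ AP = A[\mathbf{p}_1 \;\; \mathbf{p}_2 \;\; \cdots \;\; \mathbf{p}_n] = [A\mathbf{p}_1 \;\; A\mathbf{p}_2 \;\; \cdots \;\; A\mathbf{p}_n] = [\lambda_1\mathbf{p}_1 \;\; \lambda_2\mathbf{p}_2 \;\; \cdots \;\; \lambda_n\mathbf{p}_n] = PD \]

Since the column vectors of $P$ are linearly independent, it follows from Theorem 5.1.6 that $P$ is invertible, so that
this last equation can be rewritten as $P^{-1}AP = D$, which shows that $A$ is diagonalizable.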


Procedure for Diagonalizing a Matrix 


The preceding theorem guarantees that an $n \times n$ matrix $A$ with $n$ linearly independent eigenvectors is
diagonalizable, and the proof suggests the following method for diagonalizing $A$.


Procedure for Diagonalizing a Matrix


Step 1. Confirm that the matrix is actually diagonalizable by finding $n$ linearly independent eigenvectors.
One way to do this is by finding a basis for each eigenspace and merging these basis vectors into a single
set $S$. If this set has fewer than $n$ vectors, then the matrix is not diagonalizable.


Step 2. Form the matrix $P = [\mathbf{p}_1 \;\; \mathbf{p}_2 \;\; \cdots \;\; \mathbf{p}_n]$ that has the vectors in $S$ as its column vectors.

Step 3. The matrix $P^{-1}AP$ will be diagonal and have the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ corresponding to the
eigenvectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ as its successive diagonal entries.


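EXAMPLE 1 Finding a Matrix P That Diagonalizes a Matrix A


Find a matrix $P$ that diagonalizes

\[ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]

Solution In Example 7 of the preceding section we found the characteristic equation of $A$ to be

\[ (\lambda - 1)(\lambda - 2)^2 = 0 \]

and we found the following bases for the eigenspaces:

\[ \lambda = 2: \;\; \mathbf{p}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \;\; \mathbf{p}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}; \qquad \lambda = 1: \;\; \mathbf{p}_3 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \]

There are three basis vectors in total, so the matrix

\[ P = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \]

diagonalizes $A$. As a check, you should verify that

\[ P^{-1}AP = \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix} \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]


In general, there is no preferred order for the columns of $P$. Since the $i$th diagonal entry of $P^{-1}AP$ is an eigenvalue
for the $i$th column vector of $P$, changing the order of the columns of $P$ just changes the order of the eigenvalues on
the diagonal of $P^{-1}AP$. Thus, had we written

\[ P = \begin{bmatrix} -1 & -2 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \]

in the preceding example, we would have obtained

\[ P^{-1}AP = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]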

EXAMPLE 2 A Matrix That Is Not Diagonalizable


Find a matrix $P$ that diagonalizes

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ -3 & 5 & 2 \end{bmatrix} \]

Solution The characteristic polynomial of $A$ is

\[ \det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & 0 & 0 \\ -1 & \lambda - 2 & 0 \\ 3 & -5 & \lambda - 2 \end{vmatrix} = (\lambda - 1)(\lambda - 2)^2 \]

so the characteristic equation is

\[ (\lambda - 1)(\lambda - 2)^2 = 0 \]

Thus, the distinct eigenvalues of $A$ are $\lambda = 1$ and $\lambda = 2$. We leave it for you to show that bases for
the eigenspaces are

\[ \lambda = 1: \;\; \mathbf{p}_1 = \begin{bmatrix} \tfrac{1}{8} \\ -\tfrac{1}{8} \\ 1 \end{bmatrix}; \qquad \lambda = 2: \;\; \mathbf{p}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Since $A$ is a $3 \times 3$ matrix and there are only two basis vectors in total, $A$ is not diagonalizable.


Alternative Solution If you are concerned only with determining whether a matrix is
diagonalizable and not with actually finding a diagonalizing matrix $P$, then it is not necessary to
compute bases for the eigenspaces; it suffices to find the dimensions of the eigenspaces. For this
example, the eigenspace corresponding to $\lambda = 1$ is the solution space of the system

\[ \begin{bmatrix} 0 & 0 & 0 \\ -1 & -1 & 0 \\ 3 & -5 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

Since the coefficient matrix has rank 2 (verify), the nullity of this matrix is 1 by Theorem 4.8.2, and
hence the eigenspace corresponding to $\lambda = 1$ is one-dimensional.


The eigenspace corresponding to $\lambda = 2$ is the solution space of the system

\[ \begin{bmatrix} 1 & 0 & 0 \\ -1 & 0 & 0 \\ 3 & -5 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

This coefficient matrix also has rank 2 and nullity 1 (verify), so the eigenspace corresponding to
$\lambda = 2$ is also one-dimensional. Since the eigenspaces produce a total of two basis vectors, and since
three are needed, the matrix $A$ is not diagonalizable.
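The rank-and-nullity test used in the alternative solution lends itself to a short program. The following sketch, with hypothetical helper names `rank` and `eigenspace_dimension`, computes the dimension of each eigenspace as $n - \operatorname{rank}(\lambda I - A)$ and compares the total with $n$. It assumes the eigenvalues are already known.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix, found by forward elimination with exact arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pr = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pr is None:
            continue
        M[r], M[pr] = M[pr], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def eigenspace_dimension(A, lam):
    """Nullity of lam*I - A, that is, n minus its rank."""
    n = len(A)
    return n - rank([[(lam if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)])

A = [[1, 0, 0], [1, 2, 0], [-3, 5, 2]]
dims = {lam: eigenspace_dimension(A, lam) for lam in (1, 2)}

print(dims)
print(sum(dims.values()) == len(A))   # False: too few independent eigenvectors
```

Running the same function on the matrix of Example 1 with the eigenvalues 2 and 1 should give dimensions that add up to 3, in agreement with that example.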


There is an assumption in Example 1 that the column vectors of P, which are made up of basis vectors from the 
various eigenspaces of A, are linearly independent. The following theorem, proved at the end of this section, shows 
that this is so. 


THEOREM 5.2.2 


If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are eigenvectors of a matrix $A$ corresponding to distinct eigenvalues, then
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is a linearly independent set.


Remark Theorem 5.2.2 is a special case of a more general result: Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$ are distinct
eigenvalues and that we choose a linearly independent set in each of the corresponding eigenspaces. If we then
merge all these vectors into a single set, the result will still be a linearly independent set. For example, if we choose
three linearly independent vectors from one eigenspace and two linearly independent vectors from another
eigenspace, then the five vectors together form a linearly independent set. We omit the proof.


As a consequence of Theorem 5.2.2, we obtain the following important result. 


THEOREM 5.2.3 


If an $n \times n$ matrix $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable.


Proof If $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are eigenvectors corresponding to the distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then by Theorem
5.2.2, $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are linearly independent. Thus, $A$ is diagonalizable by Theorem 5.2.1.


EXAMPLE 3 Using Theorem 5.2.3


We saw in Example 3 of the preceding section that

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix} \]

has three distinct eigenvalues: $\lambda = 4$, $\lambda = 2 + \sqrt{3}$, and $\lambda = 2 - \sqrt{3}$. Therefore, $A$ is diagonalizable
and

\[ P^{-1}AP = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 2 + \sqrt{3} & 0 \\ 0 & 0 & 2 - \sqrt{3} \end{bmatrix} \]

for some invertible matrix $P$. If needed, the matrix $P$ can be found using the method shown in
Example 1 of this section.


EXAMPLE 4 Diagonalizability of Triangular Matrices


From Theorem 5.1.2, the eigenvalues of a triangular matrix are the entries on its main diagonal.
Thus, a triangular matrix with distinct entries on the main diagonal is diagonalizable. For example,

\[ A = \begin{bmatrix} -1 & 2 & 4 & 0 \\ 0 & 3 & 1 & 7 \\ 0 & 0 & 5 & 8 \\ 0 & 0 & 0 & -2 \end{bmatrix} \]

is a diagonalizable matrix with eigenvalues $\lambda_1 = -1$, $\lambda_2 = 3$, $\lambda_3 = 5$, $\lambda_4 = -2$.


Computing Powers of a Matrix 


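There are many applications in which it is necessary to compute high powers of a square matrix $A$. We will show
next that if $A$ happens to be diagonalizable, then the computations can be simplified by diagonalizing $A$.


To start, suppose that $A$ is a diagonalizable $n \times n$ matrix, that $P$ diagonalizes $A$, and that

\[ P^{-1}AP = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = D \]

Squaring both sides of this equation yields

\[ (P^{-1}AP)^2 = \begin{bmatrix} \lambda_1^2 & 0 & \cdots & 0 \\ 0 & \lambda_2^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^2 \end{bmatrix} = D^2 \]

We can rewrite the left side of this equation as

\[ (P^{-1}AP)^2 = P^{-1}APP^{-1}AP = P^{-1}AIAP = P^{-1}A^2P \]

from which we obtain the relationship $P^{-1}A^2P = D^2$. More generally, if $k$ is a positive integer, then a similar
computation will show that

\[ P^{-1}A^kP = D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} \]

which we can rewrite as

\[ A^k = PD^kP^{-1} = P \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} P^{-1} \qquad (3) \]


Formula 3 reveals that raising a diagonalizable
matrix $A$ to a positive integer power has the effect
of raising its eigenvalues to that power.


Note that computing the right side of this formula involves only three matrix multiplications and the powers of the
diagonal entries of $D$. For matrices of large size and high powers of $\lambda$, this involves substantially fewer operations
than computing $A^k$ directly.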


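EXAMPLE 5 Power of a Matrix


Use 3 to find $A^{13}$, where

\[ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} \]

Solution We showed in Example 1 that the matrix $A$ is diagonalized by

\[ P = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \]

and that

\[ D = P^{-1}AP = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Thus, it follows from 3 that

\[ A^{13} = PD^{13}P^{-1} = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2^{13} & 0 & 0 \\ 0 & 2^{13} & 0 \\ 0 & 0 & 1^{13} \end{bmatrix} \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 1 \\ -1 & 0 & -1 \end{bmatrix} \qquad (4) \]

\[ = \begin{bmatrix} -8190 & 0 & -16382 \\ 8191 & 8192 & 8191 \\ 8191 & 0 & 16383 \end{bmatrix} \]


Remark With the method in the preceding example, most of the work is in diagonalizing $A$. Once that work is
done, it can be used to compute any power of $A$. Thus, to compute $A^{1000}$ we need only change the exponents from
13 to 1000 in 4.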
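The computation in Example 5, together with the remark about changing the exponent, can be sketched in a few lines. The code below uses the matrices $P$ and $P^{-1}$ from Example 1 and compares $PD^kP^{-1}$ against repeated multiplication; the function names are chosen for this illustration only.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]
P = [[-1, 0, -2], [0, 1, 1], [1, 0, 1]]
P_inv = [[1, 0, 2], [1, 1, 1], [-1, 0, -1]]

def power_via_diagonalization(k):
    """A^k = P D^k P^{-1}, where D = diag(2, 2, 1)."""
    D_k = [[2**k, 0, 0], [0, 2**k, 0], [0, 0, 1]]
    return mat_mul(mat_mul(P, D_k), P_inv)

def power_by_repeated_multiplication(k):
    """A^k computed directly, one multiplication at a time."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for _ in range(k):
        result = mat_mul(result, A)
    return result

print(power_via_diagonalization(13))
print(power_via_diagonalization(13) == power_by_repeated_multiplication(13))
```

Changing the argument from 13 to 1000 requires no other modification, which mirrors the remark above.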


Eigenvalues of Powers of a Matrix 


Once the eigenvalues and eigenvectors of any square matrix $A$ are found, it is a simple matter to find the
eigenvalues and eigenvectors of any positive integer power of $A$. For example, if $\lambda$ is an eigenvalue of $A$ and $\mathbf{x}$ is a
corresponding eigenvector, then

\[ A^2\mathbf{x} = A(A\mathbf{x}) = A(\lambda\mathbf{x}) = \lambda(A\mathbf{x}) = \lambda(\lambda\mathbf{x}) = \lambda^2\mathbf{x} \]

which shows not only that $\lambda^2$ is an eigenvalue of $A^2$ but that $\mathbf{x}$ is a corresponding eigenvector. In general, we have
the following result.


Note that diagonalizability is not a requirement in
Theorem 5.2.4.


THEOREM 5.2.4


If $\lambda$ is an eigenvalue of a square matrix $A$ and $\mathbf{x}$ is a corresponding eigenvector, and if $k$ is any positive
integer, then $\lambda^k$ is an eigenvalue of $A^k$ and $\mathbf{x}$ is a corresponding eigenvector.


Some problems that use this theorem are given in the exercises. 


Geometric and Algebraic Multiplicity 


Theorem 5.2.3 does not completely settle the diagonalizability question since it only guarantees that a square 
matrix with n distinct eigenvalues is diagonalizable, but does not preclude the possibility that there may exist 
diagonalizable matrices with fewer than n distinct eigenvalues. The following example shows that this is indeed the 
case. 


EXAMPLE 6 The Converse of Theorem 5.2.3 Is False


Consider the matrices

\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad J = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \]

It follows from Theorem 5.1.2 that both of these matrices have only one distinct eigenvalue, namely
$\lambda = 1$, and hence only one eigenspace. We leave it as an exercise for you to solve the homogeneous
systems

\[ (\lambda I - I)\mathbf{x} = \mathbf{0} \quad \text{and} \quad (\lambda I - J)\mathbf{x} = \mathbf{0} \]

with $\lambda = 1$ and show that for $I$ the eigenspace is three-dimensional (all of $R^3$) and for $J$ it is
one-dimensional, consisting of all scalar multiples of

\[ \mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]

This shows that the converse of Theorem 5.2.3 is false, since we have produced two $3 \times 3$ matrices
with fewer than three distinct eigenvalues, one of which is diagonalizable and the other of which is
not.
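As a sketch of how the two systems in this example work out when $\lambda = 1$ (offered as one way to do the exercise, not as the text's own solution): for $I$ the coefficient matrix $I - I$ is the zero matrix, so every vector in $R^3$ is a solution, while for $J$ the system reads

```latex
\[
(I - J)\mathbf{x} =
\begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\quad\Longrightarrow\quad
x_2 = 0, \;\; x_3 = 0, \;\; x_1 = t
\quad\Longrightarrow\quad
\mathbf{x} = t \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]
```

The coefficient matrix $I - J$ has rank 2 and therefore nullity 1, which is why the eigenspace of $J$ is one-dimensional even though $\lambda - 1$ appears three times as a factor of its characteristic polynomial.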


A full excursion into the study of diagonalizability is left for more advanced courses, but we will touch on one
theorem that is important to a fuller understanding of diagonalizability. It can be proved that if $\lambda_0$ is an eigenvalue
of $A$, then the dimension of the eigenspace corresponding to $\lambda_0$ cannot exceed the number of times that $\lambda - \lambda_0$
appears as a factor of the characteristic polynomial of $A$. For example, in Example 1 and Example 2 the
characteristic polynomial is

\[ (\lambda - 1)(\lambda - 2)^2 \]

Thus, the eigenspace corresponding to $\lambda = 1$ is at most (hence exactly) one-dimensional, and the eigenspace
corresponding to $\lambda = 2$ is at most two-dimensional. In Example 1 the eigenspace corresponding to $\lambda = 2$ actually
had dimension 2, resulting in diagonalizability, but in Example 2 the eigenspace corresponding to $\lambda = 2$ had only
dimension 1, resulting in nondiagonalizability.


There is some terminology that is related to these ideas. If $\lambda_0$ is an eigenvalue of an $n \times n$ matrix $A$, then the
dimension of the eigenspace corresponding to $\lambda_0$ is called the geometric multiplicity of $\lambda_0$, and the number of
times that $\lambda - \lambda_0$ appears as a factor in the characteristic polynomial of $A$ is called the algebraic multiplicity of $\lambda_0$.
The following theorem, which we state without proof, summarizes the preceding discussion.


THEOREM 5.2.5 Geometric and Algebraic Multiplicity 


If A is a square matrix, then: 
(a) For every eigenvalue of $A$, the geometric multiplicity is less than or equal to the algebraic multiplicity.


(b) $A$ is diagonalizable if and only if the geometric multiplicity of every eigenvalue is equal to the
algebraic multiplicity.


OPTIONAL 


We will complete this section with an optional proof of Theorem 5.2.2. 


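Proof of Theorem 5.2.2 Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ be eigenvectors of $A$ corresponding to distinct eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_k$. We will assume that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are linearly dependent and obtain a contradiction. We can then
conclude that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are linearly independent.


Since an eigenvector is nonzero by definition, $\{\mathbf{v}_1\}$ is linearly independent. Let $r$ be the largest integer such that
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is linearly independent. Since we are assuming that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ is linearly dependent, $r$
satisfies $1 \le r < k$. Moreover, by the definition of $r$, $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{r+1}\}$ is linearly dependent. Thus, there are
scalars $c_1, c_2, \ldots, c_{r+1}$, not all zero, such that

\[ c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{r+1}\mathbf{v}_{r+1} = \mathbf{0} \qquad (5) \]

Multiplying both sides of 5 by $A$ and using the fact that

\[ A\mathbf{v}_1 = \lambda_1\mathbf{v}_1, \quad A\mathbf{v}_2 = \lambda_2\mathbf{v}_2, \quad \ldots, \quad A\mathbf{v}_{r+1} = \lambda_{r+1}\mathbf{v}_{r+1} \]

we obtain

\[ c_1\lambda_1\mathbf{v}_1 + c_2\lambda_2\mathbf{v}_2 + \cdots + c_{r+1}\lambda_{r+1}\mathbf{v}_{r+1} = \mathbf{0} \qquad (6) \]

If we now multiply both sides of 5 by $\lambda_{r+1}$ and subtract the resulting equation from 6 we obtain

\[ c_1(\lambda_1 - \lambda_{r+1})\mathbf{v}_1 + c_2(\lambda_2 - \lambda_{r+1})\mathbf{v}_2 + \cdots + c_r(\lambda_r - \lambda_{r+1})\mathbf{v}_r = \mathbf{0} \]

Since $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a linearly independent set, this equation implies that

\[ c_1(\lambda_1 - \lambda_{r+1}) = c_2(\lambda_2 - \lambda_{r+1}) = \cdots = c_r(\lambda_r - \lambda_{r+1}) = 0 \]

and since $\lambda_1, \lambda_2, \ldots, \lambda_{r+1}$ are assumed to be distinct, it follows that

\[ c_1 = c_2 = \cdots = c_r = 0 \qquad (7) \]

Substituting these values in 5 yields

\[ c_{r+1}\mathbf{v}_{r+1} = \mathbf{0} \]

Since the eigenvector $\mathbf{v}_{r+1}$ is nonzero, it follows that

\[ c_{r+1} = 0 \qquad (8) \]

But equations 7 and 8 contradict the fact that $c_1, c_2, \ldots, c_{r+1}$ are not all zero, so the proof is complete.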


Concept Review 

* Similarity transformation 
* Similarity invariant 

* Similar matrices 

* Diagonalizable matrix 

* Geometric multiplicity 


* Algebraic multiplicity 


Skills 

* Determine whether a square matrix А is diagonalizable. 
* Diagonalize a square matrix A. 

* Find powers of a matrix using similarity. 


* Find the geometric multiplicity and the algebraic multiplicity of an eigenvalue. 



Exercise Set 5.2 


In Exercises 1–4, show that $A$ and $B$ are not similar matrices.


ia [i3] az [f 0 
= 13 2813 2 


Answer: 


Possible reason: Determinants are different. 


Answer: 


5. Let $A$ be a $6 \times 6$ matrix with characteristic equation $\lambda^2(\lambda - 1)(\lambda - 2)^3 = 0$. What are the possible dimensions
for eigenspaces of $A$?
Answer:


$\lambda = 0$: 1 or 2; $\lambda = 1$: 1; $\lambda = 2$: 1, 2, or 3
6. Let 


O Ww o 
+ м н 


(a) Find the eigenvalues of A. 
(b) For each eigenvalue $\lambda$, find the rank of the matrix $\lambda I - A$.


(c) Is A diagonalizable? Justify your conclusion. 
In Exercises 7-11, use the method of Exercise 6 to determine whether the matrix is diagonalizable. 
7. É 1 
12 
Answer: 


Not diagonalizable 


Answer: 


Not diagonalizable 


10.[-1 0 1 
=1 3 0 
=4 13 m1 


11.|2 —-10 1 
0 2 1 -1 
Dn 02 
0 00 3 


Answer: 


Not diagonalizable 


In Exercises 12-15, find a matrix P that diagonalizes A, and compute $P^{-1}AP$.


am [TX Я 


—20 17 
13. 1 0 
А= 
HE 
Answer: 
Lag { i ù 
Р ==| 3 pa =| ] 
1 1 B 
14. 100 
A=/0 1 1 
0 1 1 
15. 20 -2 
А=|0 3 0 
0 0 3 
Answer 


0 1 300 
P=| 010[|,72!4P-|030 
100 002 


In Exercises 16-21, find the geometric and algebraic multiplicity of each eigenvalue of the matrix A, and determine whether A is diagonalizable. If A is diagonalizable, then find a matrix P that diagonalizes A, and find $P^{-1}AP$.


Answer: 


Answer: 


(cc: | 
ге еч оса 
о о о м | 
о осе с onmo 
сше ЕЕ 
са о о о сы о оо 
| | 
4 | 

| | 

ч ч 
= - 
A ез 


Answer: 


0 0 
0 0 
3 0 
0 3 


0 
—2 
0 
0 


22. Use the method of Example 5 to compute $A^{10}$, where


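23. Use the method of Example 5 to compute $A^{11}$, where

$$A = \begin{bmatrix} -1 & 7 & -1 \\ 0 & 1 & 0 \\ 0 & 15 & -2 \end{bmatrix}$$

Answer:

$$\begin{bmatrix} -1 & 10237 & -2047 \\ 0 & 1 & 0 \\ 0 & 10245 & -2048 \end{bmatrix}$$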
24. In each part, compute the stated power of 

1-2 8 
A—|0-1 0 
0 0 =l 


(a) 241000 (b) 471000 (с) 47901 (d) А?! 


25. Find A" if n is a positive integer and 


3-1 0 
Ass Lem mes 
D. 3 
Answer: 
do „Т 
1 1 111" o olf 3 6 
A"-pPp'Pl-|2 o -1|| о з" 0 i 0 -i 
DOSE MPO i ek ce +] 
3 MEE 
26. Let 
b 
А=|° 
|; a 
Show that 


(a) A is diagonalizable if $(a - d)^2 + 4bc > 0$.
(b) A is not diagonalizable if $(a - d)^2 + 4bc < 0$.
[Hint: See Exercise 19 of Section 5.1.] 
27. In the case where the matrix A in Exercise 26 is diagonalizable, find a matrix P that diagonalizes A. [Hint: See 
Exercise 20 of Section 5.1.] 


Answer: 


=) =) 


On possibility is P = p -Àj a=- 


| where Ay and Аз are as in Exercise 20 of Section 5.1. 


28. Prove that similar matrices have the same rank. 


29. Prove that similar matrices have the same nullity. 


30. Prove that similar matrices have the same trace.

31. Prove that if A is diagonalizable, then so is $A^k$ for every positive integer k.

32. Prove that if A is a diagonalizable matrix, then the rank of A is the number of nonzero eigenvalues of A.

33. Suppose that the characteristic polynomial of some matrix A is found to be $p(\lambda) = (\lambda - 1)(\lambda - 3)^2(\lambda - 4)^3$. In each part, answer the question and explain your reasoning.

(a) What can you say about the dimensions of the eigenspaces of A?

(b) What can you say about the dimensions of the eigenspaces if you know that A is diagonalizable?

(c) If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a linearly independent set of eigenvectors of A all of which correspond to the same eigenvalue of A, what can you say about the eigenvalue?

Answer:

(a) $\lambda = 1$: dimension $= 1$; $\lambda = 3$: dimension $\le 2$; $\lambda = 4$: dimension $\le 3$

(b) Dimensions will be exactly 1, 2, and 3.

(c) $\lambda = 4$

34. This problem will lead you through a proof of the fact that the algebraic multiplicity of an eigenvalue of an $n \times n$ matrix A is greater than or equal to the geometric multiplicity. For this purpose, assume that $\lambda_0$ is an eigenvalue with geometric multiplicity k.

(a) Prove that there is a basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ for $R^n$ in which the first k vectors of B form a basis for the eigenspace corresponding to $\lambda_0$.

(b) Let P be the matrix having the vectors in B as columns. Prove that the product AP can be expressed as

$$AP = P\begin{bmatrix} \lambda_0 I_k & X \\ 0 & Y \end{bmatrix}$$

[Hint: Compare the first k column vectors on both sides.]

(c) Use the result in part (b) to prove that A is similar to

$$C = \begin{bmatrix} \lambda_0 I_k & X \\ 0 & Y \end{bmatrix}$$

and hence that A and C have the same characteristic polynomial.

(d) By considering $\det(\lambda I - C)$, prove that the characteristic polynomial of C (and hence A) contains the factor $(\lambda - \lambda_0)$ at least k times, thereby proving that the algebraic multiplicity of $\lambda_0$ is greater than or equal to the geometric multiplicity k.


True-False Exercises 


In parts (a)-(h) determine whether the statement is true or false, and justify your answer.


(a) Every square matrix is similar to itself. 


Answer: 


True 


(b) If A, B, and C are matrices for which А is similar to В and B is similar to C, then A is similar to C. 


Answer: 


True 


(c) If A and B are similar invertible matrices, then $A^{-1}$ and $B^{-1}$ are similar.


Answer: 


True 


(d) If A is diagonalizable, then there is a unique matrix P such that $P^{-1}AP$ is diagonal.


Answer: 


False 


(e) If A is diagonalizable and invertible, then $A^{-1}$ is diagonalizable.
Answer: 


True 


(f) If A is diagonalizable, then $A^T$ is diagonalizable.
Answer: 


True 


(g) If there is a basis for $R^n$ consisting of eigenvectors of an $n \times n$ matrix A, then A is diagonalizable.
Answer: 


True 


(h) If every eigenvalue of a matrix A has algebraic multiplicity 1, then A is diagonalizable. 
Answer: 


True 
5.3 Complex Vector Spaces


Because the characteristic equation of any square matrix can have complex solutions, the notions of complex eigenvalues and 
eigenvectors arise naturally, even within the context of matrices with real entries. In this section we will discuss this idea and 
apply our results to study symmetric matrices in more detail. A review of the essentials of complex numbers appears in the 
back of this text. 


Review of Complex Numbers 


Recall that if $z = a + bi$ is a complex number, then:

* $\mathrm{Re}(z) = a$ and $\mathrm{Im}(z) = b$ are called the real part of z and the imaginary part of z, respectively,
* $|z| = \sqrt{a^2 + b^2}$ is called the modulus (or absolute value) of z,
* $\bar{z} = a - bi$ is called the complex conjugate of z,
* $z\bar{z} = a^2 + b^2 = |z|^2$,
* the angle $\phi$ in Figure 5.3.1 is called an argument of z,
* $\mathrm{Re}(z) = |z|\cos\phi$,
* $\mathrm{Im}(z) = |z|\sin\phi$,
* $z = |z|(\cos\phi + i\sin\phi)$ is called the polar form of z.


Figure 5.3.1
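The quantities in this review can be checked numerically. The following is a minimal sketch using Python's built-in complex type and the standard `cmath` module; the sample value z = 3 + 4i is an arbitrary choice made for illustration and does not come from the text.

```python
import cmath
import math

# Illustrative value only (an assumption, not from the text): z = 3 + 4i
z = complex(3, 4)

modulus = abs(z)             # |z| = sqrt(a^2 + b^2)
conjugate = z.conjugate()    # a - bi
phi = cmath.phase(z)         # an argument of z, chosen in (-pi, pi]

# z times its conjugate should equal |z|^2
product = z * conjugate

# Rebuild z from its polar form |z|(cos(phi) + i sin(phi))
rebuilt = modulus * complex(math.cos(phi), math.sin(phi))

print(modulus, conjugate, product, rebuilt)
```

The rebuilt value agrees with z only up to floating-point rounding, so it is best compared with a small tolerance rather than with exact equality.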


Complex Eigenvalues 
In Formula 3 of Section 5.1 we observed that the characteristic equation of a general $n \times n$ matrix A has the form

$$\lambda^n + c_1\lambda^{n-1} + \cdots + c_n = 0 \tag{1}$$

in which the highest power of $\lambda$ has a coefficient of 1. Up to now we have limited our discussion to matrices in which the solutions of 1 are real numbers. However, it is possible for the characteristic equation of a matrix A with real entries to have imaginary solutions; for example, the characteristic equation of the matrix

$$A = \begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix}$$

is

$$\begin{vmatrix} \lambda + 2 & 1 \\ -5 & \lambda - 2 \end{vmatrix} = \lambda^2 + 1 = 0$$

which has the imaginary solutions $\lambda = i$ and $\lambda = -i$. To deal with this case we will need to explore the notion of a complex vector space and some related ideas.


Vectors in $C^n$


A vector space in which scalars are allowed to be complex numbers is called a complex vector space. In this section we will be concerned only with the following complex generalization of the real vector space $R^n$.


DEFINITION 1 


If n is a positive integer, then a complex n-tuple is a sequence of n complex numbers $(v_1, v_2, \ldots, v_n)$. The set of all complex n-tuples is called complex n-space and is denoted by $C^n$. Scalars are complex numbers, and the operations of addition, subtraction, and scalar multiplication are performed componentwise.


The terminology used for n-tuples of real numbers applies to complex n-tuples without change. Thus, if $v_1, v_2, \ldots, v_n$ are complex numbers, then we call $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ a vector in $C^n$ and $v_1, v_2, \ldots, v_n$ its components. Some examples of vectors in $C^3$ are

$$\mathbf{u} = (1 + i,\, -4i,\, 3 + 2i), \quad \mathbf{v} = (0,\, i,\, 5), \quad \mathbf{w} = \left(6 - \sqrt{2}\,i,\; 9 + \tfrac{1}{2}i,\; \pi i\right)$$


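Every vector

$$\mathbf{v} = (v_1, v_2, \ldots, v_n) = (a_1 + b_1 i,\, a_2 + b_2 i,\, \ldots,\, a_n + b_n i)$$

in $C^n$ can be split into real and imaginary parts as

$$\mathbf{v} = (a_1, a_2, \ldots, a_n) + i(b_1, b_2, \ldots, b_n)$$

which we also denote as

$$\mathbf{v} = \mathrm{Re}(\mathbf{v}) + i\,\mathrm{Im}(\mathbf{v})$$

where

$$\mathrm{Re}(\mathbf{v}) = (a_1, a_2, \ldots, a_n) \quad \text{and} \quad \mathrm{Im}(\mathbf{v}) = (b_1, b_2, \ldots, b_n)$$

The vector

$$\bar{\mathbf{v}} = (\bar{v}_1, \bar{v}_2, \ldots, \bar{v}_n) = (a_1 - b_1 i,\, a_2 - b_2 i,\, \ldots,\, a_n - b_n i)$$

is called the complex conjugate of $\mathbf{v}$ and can be expressed in terms of $\mathrm{Re}(\mathbf{v})$ and $\mathrm{Im}(\mathbf{v})$ as

$$\bar{\mathbf{v}} = (a_1, a_2, \ldots, a_n) - i(b_1, b_2, \ldots, b_n) = \mathrm{Re}(\mathbf{v}) - i\,\mathrm{Im}(\mathbf{v}) \tag{2}$$

It follows that the vectors in $R^n$ can be viewed as those vectors in $C^n$ whose imaginary part is zero; or stated another way, a vector $\mathbf{v}$ in $C^n$ is in $R^n$ if and only if $\bar{\mathbf{v}} = \mathbf{v}$.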


In this section we will also need to consider matrices with complex entries, so henceforth we will call a matrix А a real matrix 
if its entries are required to be real numbers and a complex matrix if its entries are allowed to be complex numbers. The 
standard operations on real matrices carry over to complex matrices without change, and all of the familiar properties of 
matrices continue to hold. 


If A is a complex matrix, then Re(A) and Im(A) are the matrices formed from the real and imaginary parts of the entries of A, and $\bar{A}$ is the matrix formed by taking the complex conjugate of each entry in A.


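EXAMPLE 1 Real and Imaginary Parts of Vectors and Matrices


Let

$$\mathbf{v} = (3 + i,\, -2i,\, 5) \quad \text{and} \quad A = \begin{bmatrix} 1 + i & -i \\ 4 & 6 - 2i \end{bmatrix}$$

Then

$$\bar{\mathbf{v}} = (3 - i,\, 2i,\, 5), \quad \mathrm{Re}(\mathbf{v}) = (3, 0, 5), \quad \mathrm{Im}(\mathbf{v}) = (1, -2, 0)$$

$$\bar{A} = \begin{bmatrix} 1 - i & i \\ 4 & 6 + 2i \end{bmatrix}, \quad \mathrm{Re}(A) = \begin{bmatrix} 1 & 0 \\ 4 & 6 \end{bmatrix}, \quad \mathrm{Im}(A) = \begin{bmatrix} 1 & -1 \\ 0 & -2 \end{bmatrix}$$

$$\det(A) = \begin{vmatrix} 1 + i & -i \\ 4 & 6 - 2i \end{vmatrix} = (1 + i)(6 - 2i) - (-i)(4) = 8 + 8i$$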


Algebraic Properties of the Complex Conjugate 


The next two theorems list some properties of complex vectors and matrices that we will need in this section. Some of the 
proofs are given as exercises. 


THEOREM 5.3.1 


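If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $C^n$, and if k is a scalar, then:

(a) $\bar{\bar{\mathbf{u}}} = \mathbf{u}$

(b) $\overline{k\mathbf{u}} = \bar{k}\,\bar{\mathbf{u}}$

(c) $\overline{\mathbf{u} + \mathbf{v}} = \bar{\mathbf{u}} + \bar{\mathbf{v}}$

(d) $\overline{\mathbf{u} - \mathbf{v}} = \bar{\mathbf{u}} - \bar{\mathbf{v}}$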


THEOREM 5.3.2 


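If A is an $m \times k$ complex matrix and B is a $k \times n$ complex matrix, then:

(a) $\bar{\bar{A}} = A$

(b) $\overline{A^T} = (\bar{A})^T$

(c) $\overline{AB} = \bar{A}\,\bar{B}$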


The Complex Euclidean Inner Product 


The following definition extends the notions of dot product and norm to C”. 


DEFINITION 2 


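If $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $C^n$, then the complex Euclidean inner product of $\mathbf{u}$ and $\mathbf{v}$ (also called the complex dot product) is denoted by $\mathbf{u} \cdot \mathbf{v}$ and is defined as

$$\mathbf{u} \cdot \mathbf{v} = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n \tag{3}$$

We also define the Euclidean norm on $C^n$ to be

$$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{|v_1|^2 + |v_2|^2 + \cdots + |v_n|^2} \tag{4}$$

As in the real case, we call $\mathbf{v}$ a unit vector in $C^n$ if $\|\mathbf{v}\| = 1$, and we say two vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if $\mathbf{u} \cdot \mathbf{v} = 0$.

The complex conjugates in 3 ensure that $\|\mathbf{v}\|$ is a real number, for without them the quantity $\mathbf{v} \cdot \mathbf{v}$ in 4 might be imaginary.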


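EXAMPLE 2 Complex Euclidean Inner Product and Norm


Find $\mathbf{u} \cdot \mathbf{v}$, $\mathbf{v} \cdot \mathbf{u}$, $\|\mathbf{u}\|$, and $\|\mathbf{v}\|$ for the vectors

$$\mathbf{u} = (1 + i,\, i,\, 3 - i) \quad \text{and} \quad \mathbf{v} = (1 + i,\, 2,\, 4i)$$

Solution

$$\mathbf{u} \cdot \mathbf{v} = (1 + i)\overline{(1 + i)} + i\,\overline{(2)} + (3 - i)\overline{(4i)} = (1 + i)(1 - i) + 2i + (3 - i)(-4i) = -2 - 10i$$

$$\mathbf{v} \cdot \mathbf{u} = (1 + i)\overline{(1 + i)} + 2\,\overline{(i)} + (4i)\overline{(3 - i)} = (1 + i)(1 - i) - 2i + 4i(3 + i) = -2 + 10i$$

$$\|\mathbf{u}\| = \sqrt{|1 + i|^2 + |i|^2 + |3 - i|^2} = \sqrt{2 + 1 + 10} = \sqrt{13}$$

$$\|\mathbf{v}\| = \sqrt{|1 + i|^2 + |2|^2 + |4i|^2} = \sqrt{2 + 4 + 16} = \sqrt{22}$$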
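The computations in Example 2 can be reproduced with a few lines of Python. This is only a sketch: the helper names `cdot` and `cnorm` are hypothetical, chosen here for illustration, and the conjugate is placed on the second factor to match Formula 3.

```python
import math

def cdot(u, v):
    """Complex Euclidean inner product: sum of u_k times conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def cnorm(v):
    """Euclidean norm on C^n; v . v is real, so only its real part is used."""
    return math.sqrt(cdot(v, v).real)

# Vectors from Example 2
u = [1 + 1j, 1j, 3 - 1j]
v = [1 + 1j, 2 + 0j, 4j]

print(cdot(u, v))   # u . v
print(cdot(v, u))   # v . u, the complex conjugate of u . v
print(cnorm(u) ** 2, cnorm(v) ** 2)
```

Swapping the arguments conjugates the result, which is the antisymmetry property of the complex dot product.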


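Recall from Table 1 of Section 3.2 that if $\mathbf{u}$ and $\mathbf{v}$ are column vectors in $R^n$, then their dot product can be expressed as

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\mathbf{v} = \mathbf{v}^T\mathbf{u}$$

The analogous formulas in $C^n$ are (verify)

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T\bar{\mathbf{v}} = \bar{\mathbf{v}}^T\mathbf{u} \tag{5}$$

Example 2 reveals a major difference between the dot product on $R^n$ and the complex dot product on $C^n$. For the dot product on $R^n$ we always have $\mathbf{v} \cdot \mathbf{u} = \mathbf{u} \cdot \mathbf{v}$ (the symmetry property), but for the complex dot product the corresponding relationship is given by $\mathbf{u} \cdot \mathbf{v} = \overline{\mathbf{v} \cdot \mathbf{u}}$, which is called its antisymmetry property. The following theorem is an analog of Theorem 3.2.2.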


THEOREM 5.3.3 


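If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $C^n$, and if k is a scalar, then the complex Euclidean inner product has the following properties:

(a) $\mathbf{u} \cdot \mathbf{v} = \overline{\mathbf{v} \cdot \mathbf{u}}$ [Antisymmetry property]

(b) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$ [Distributive property]

(c) $k(\mathbf{u} \cdot \mathbf{v}) = (k\mathbf{u}) \cdot \mathbf{v}$ [Homogeneity property]

(d) $\mathbf{u} \cdot k\mathbf{v} = \bar{k}(\mathbf{u} \cdot \mathbf{v})$ [Antihomogeneity property]

(e) $\mathbf{v} \cdot \mathbf{v} \ge 0$ and $\mathbf{v} \cdot \mathbf{v} = 0$ if and only if $\mathbf{v} = \mathbf{0}$. [Positivity property]


Parts (c) and (d) of this theorem state that a scalar multiplying a complex Euclidean inner product can be regrouped with the first vector, but to regroup it with the second vector you must first take its complex conjugate. We will prove part (d), and leave the others as exercises.


Proof (d)

$$k(\mathbf{u} \cdot \mathbf{v}) = k\,\overline{(\mathbf{v} \cdot \mathbf{u})} = \overline{\bar{k}(\mathbf{v} \cdot \mathbf{u})} = \overline{(\bar{k}\mathbf{v}) \cdot \mathbf{u}} = \mathbf{u} \cdot (\bar{k}\mathbf{v})$$

To complete the proof, substitute $\bar{k}$ for k and use the fact that $\bar{\bar{k}} = k$.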


Vector Concepts in $C^n$


Except for the use of complex scalars, the notions of linear combination, linear independence, subspace, spanning, basis, and dimension carry over without change to $C^n$.


Is $R^n$ a subspace of $C^n$? Explain.

Eigenvalues and eigenvectors are defined for complex matrices exactly as for real matrices. If A is an $n \times n$ matrix with complex entries, then the complex roots of the characteristic equation $\det(\lambda I - A) = 0$ are called complex eigenvalues of A. As in the real case, $\lambda$ is a complex eigenvalue of A if and only if there exists a nonzero vector $\mathbf{x}$ in $C^n$ such that $A\mathbf{x} = \lambda\mathbf{x}$. Each such $\mathbf{x}$ is called a complex eigenvector of A corresponding to $\lambda$. The complex eigenvectors of A corresponding to $\lambda$ are the nonzero solutions of the linear system $(\lambda I - A)\mathbf{x} = \mathbf{0}$, and the set of all such solutions is a subspace of $C^n$, called the eigenspace of A corresponding to $\lambda$.


The following theorem states that if a real matrix has complex eigenvalues, then those eigenvalues and their corresponding 
eigenvectors occur in conjugate pairs. 


THEOREM 5.3.4 


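If $\lambda$ is an eigenvalue of a real $n \times n$ matrix A, and if $\mathbf{x}$ is a corresponding eigenvector, then $\bar{\lambda}$ is also an eigenvalue of A, and $\bar{\mathbf{x}}$ is a corresponding eigenvector.


Proof Since $\lambda$ is an eigenvalue of A and $\mathbf{x}$ is a corresponding eigenvector, we have

$$\overline{A\mathbf{x}} = \overline{\lambda\mathbf{x}} = \bar{\lambda}\,\bar{\mathbf{x}} \tag{6}$$

However, $\bar{A} = A$, since A has real entries, so it follows from part (c) of Theorem 5.3.2 that

$$\overline{A\mathbf{x}} = \bar{A}\,\bar{\mathbf{x}} = A\bar{\mathbf{x}} \tag{7}$$

Equations 6 and 7 together imply that

$$A\bar{\mathbf{x}} = \overline{A\mathbf{x}} = \bar{\lambda}\,\bar{\mathbf{x}}$$

in which $\bar{\mathbf{x}} \ne \mathbf{0}$ (why?); this tells us that $\bar{\lambda}$ is an eigenvalue of A and $\bar{\mathbf{x}}$ is a corresponding eigenvector.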


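EXAMPLE 3 Complex Eigenvalues and Eigenvectors


Find the eigenvalues and bases for the eigenspaces of

$$A = \begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix}$$

Solution The characteristic polynomial of A is

$$\begin{vmatrix} \lambda + 2 & 1 \\ -5 & \lambda - 2 \end{vmatrix} = \lambda^2 + 1 = (\lambda - i)(\lambda + i)$$

so the eigenvalues of A are $\lambda = i$ and $\lambda = -i$. Note that these eigenvalues are complex conjugates, as guaranteed by Theorem 5.3.4.

To find the eigenvectors we must solve the system

$$\begin{bmatrix} \lambda + 2 & 1 \\ -5 & \lambda - 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

with $\lambda = i$ and then with $\lambda = -i$. With $\lambda = i$, this system becomes

$$\begin{bmatrix} i + 2 & 1 \\ -5 & i - 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{8}$$

We could solve this system by reducing the augmented matrix

$$\begin{bmatrix} i + 2 & 1 & 0 \\ -5 & i - 2 & 0 \end{bmatrix} \tag{9}$$

to reduced row echelon form by Gauss-Jordan elimination, though the complex arithmetic is somewhat tedious. A simpler procedure here is first to observe that the reduced row echelon form of 9 must have a row of zeros because 8 has nontrivial solutions. This being the case, each row of 9 must be a scalar multiple of the other, and hence the first row can be made into a row of zeros by adding a suitable multiple of the second row to it. Accordingly, we can simply set the entries in the first row to zero, then interchange the rows, and then multiply the new first row by $-\tfrac{1}{5}$ to obtain the reduced row echelon form

$$\begin{bmatrix} 1 & \tfrac{2}{5} - \tfrac{1}{5}i & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, a general solution of the system is

$$x_1 = \left(-\tfrac{2}{5} + \tfrac{1}{5}i\right)t, \quad x_2 = t$$

This tells us that the eigenspace corresponding to $\lambda = i$ is one-dimensional and consists of all complex scalar multiples of the basis vector

$$\mathbf{x} = \begin{bmatrix} -\tfrac{2}{5} + \tfrac{1}{5}i \\ 1 \end{bmatrix} \tag{10}$$

As a check, let us confirm that $A\mathbf{x} = i\mathbf{x}$. We obtain

$$A\mathbf{x} = \begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} -\tfrac{2}{5} + \tfrac{1}{5}i \\ 1 \end{bmatrix} = \begin{bmatrix} -2\left(-\tfrac{2}{5} + \tfrac{1}{5}i\right) - 1 \\ 5\left(-\tfrac{2}{5} + \tfrac{1}{5}i\right) + 2 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{5} - \tfrac{2}{5}i \\ i \end{bmatrix} = i\mathbf{x}$$

We could find a basis for the eigenspace corresponding to $\lambda = -i$ in a similar way, but the work is unnecessary, since Theorem 5.3.4 implies that

$$\bar{\mathbf{x}} = \begin{bmatrix} -\tfrac{2}{5} - \tfrac{1}{5}i \\ 1 \end{bmatrix} \tag{11}$$

must be a basis for this eigenspace. The following computations confirm that $\bar{\mathbf{x}}$ is an eigenvector of A corresponding to $\lambda = -i$:

$$A\bar{\mathbf{x}} = \begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} -\tfrac{2}{5} - \tfrac{1}{5}i \\ 1 \end{bmatrix} = \begin{bmatrix} -2\left(-\tfrac{2}{5} - \tfrac{1}{5}i\right) - 1 \\ 5\left(-\tfrac{2}{5} - \tfrac{1}{5}i\right) + 2 \end{bmatrix} = \begin{bmatrix} -\tfrac{1}{5} + \tfrac{2}{5}i \\ -i \end{bmatrix} = -i\bar{\mathbf{x}}$$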
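The two checks at the end of Example 3 can also be carried out numerically. The sketch below uses only built-in Python complex arithmetic; the helper names `matvec` and `residual` are hypothetical and introduced here for illustration.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of entries)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def residual(A, lam, x):
    """Largest entry of |Ax - lam*x|; close to zero for an eigenpair."""
    Ax = matvec(A, x)
    return max(abs(Ax[k] - lam * x[k]) for k in range(len(x)))

# Matrix, eigenvalue, and eigenvector from Example 3 (Formula 10)
A = [[-2, -1], [5, 2]]
lam = 1j
x = [complex(-2 / 5, 1 / 5), complex(1, 0)]

# Theorem 5.3.4: the conjugates should form an eigenpair as well
lam_bar = lam.conjugate()
x_bar = [c.conjugate() for c in x]

print(residual(A, lam, x), residual(A, lam_bar, x_bar))
```

Both residuals are zero up to floating-point rounding, in agreement with the hand computations.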


Since a number of our subsequent examples will involve $2 \times 2$ matrices with real entries, it will be useful to discuss some general results about the eigenvalues of such matrices. Observe first that the characteristic polynomial of the matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is

$$\det(\lambda I - A) = \begin{vmatrix} \lambda - a & -b \\ -c & \lambda - d \end{vmatrix} = (\lambda - a)(\lambda - d) - bc = \lambda^2 - (a + d)\lambda + (ad - bc)$$

We can express this in terms of the trace and determinant of A as

$$\det(\lambda I - A) = \lambda^2 - \mathrm{tr}(A)\lambda + \det(A) \tag{12}$$

from which it follows that the characteristic equation of A is

$$\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0 \tag{13}$$

Now recall from algebra that if $ax^2 + bx + c = 0$ is a quadratic equation with real coefficients, then the discriminant $b^2 - 4ac$ determines the nature of the roots:

$b^2 - 4ac > 0$ [Two distinct real roots]
$b^2 - 4ac = 0$ [One repeated real root]
$b^2 - 4ac < 0$ [Two conjugate imaginary roots]

Applying this to 13 with $a = 1$, $b = -\mathrm{tr}(A)$, and $c = \det(A)$ yields the following theorem.


Olga Taussky-Todd (1906—1995) 


Historical Note Olga Taussky-Todd was one of the pioneering women in matrix analysis and the first woman 
appointed to the faculty at the California Institute of Technology. She worked at the National Physical Laboratory in 
London during World War II, where she was assigned to study flutter in supersonic aircraft. While there, she realized 
that some results about the eigenvalues of a certain 6 x 6 complex matrix could be used to answer key questions about
the flutter problem that would otherwise have required laborious calculation. After World War II Olga Taussky-Todd 
continued her work on matrix-related subjects and helped to draw many known but disparate results about matrices 
into the coherent subject that we now call matrix theory. 

[Image: Courtesy of the Archives, California Institute of Technology]


THEOREM 5.3.5 


If A is a $2 \times 2$ matrix with real entries, then the characteristic equation of A is $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$ and

(a) A has two distinct real eigenvalues if $\mathrm{tr}(A)^2 - 4\det(A) > 0$;
(b) A has one repeated real eigenvalue if $\mathrm{tr}(A)^2 - 4\det(A) = 0$;
(c) A has two complex conjugate eigenvalues if $\mathrm{tr}(A)^2 - 4\det(A) < 0$.


EXAMPLE 4 Eigenvalues of a 2 x 2 Matrix


In each part, use Formula 13 for the characteristic equation to find the eigenvalues of

(a) $A = \begin{bmatrix} 2 & 2 \\ -1 & 5 \end{bmatrix}$

(b) $A = \begin{bmatrix} 0 & -1 \\ 1 & 2 \end{bmatrix}$

(c) $A = \begin{bmatrix} 2 & 3 \\ -3 & 2 \end{bmatrix}$

Solution

(a) We have $\mathrm{tr}(A) = 7$ and $\det(A) = 12$, so the characteristic equation of A is

$$\lambda^2 - 7\lambda + 12 = 0$$

Factoring yields $(\lambda - 4)(\lambda - 3) = 0$, so the eigenvalues of A are $\lambda = 4$ and $\lambda = 3$.

(b) We have $\mathrm{tr}(A) = 2$ and $\det(A) = 1$, so the characteristic equation of A is

$$\lambda^2 - 2\lambda + 1 = 0$$

Factoring this equation yields $(\lambda - 1)^2 = 0$, so $\lambda = 1$ is the only eigenvalue of A; it has algebraic multiplicity 2.

(c) We have $\mathrm{tr}(A) = 4$ and $\det(A) = 13$, so the characteristic equation of A is

$$\lambda^2 - 4\lambda + 13 = 0$$

Solving this equation by the quadratic formula yields

$$\lambda = \frac{4 \pm \sqrt{(-4)^2 - 4(13)}}{2} = \frac{4 \pm \sqrt{-36}}{2} = 2 \pm 3i$$

Thus, the eigenvalues of A are $\lambda = 2 + 3i$ and $\lambda = 2 - 3i$.
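Theorem 5.3.5 translates directly into a short routine. The following sketch (the function name `eigenvalues_2x2` is a hypothetical choice, not notation from the text) applies the quadratic formula to Formula 13 and uses `cmath.sqrt` so that a negative discriminant produces a conjugate pair rather than an error.

```python
import cmath

def eigenvalues_2x2(A):
    """Return (root1, root2, discriminant) for a real 2 x 2 matrix A,
    using lambda^2 - tr(A)*lambda + det(A) = 0."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = tr * tr - 4 * det
    root = cmath.sqrt(disc)          # complex square root handles disc < 0
    return (tr + root) / 2, (tr - root) / 2, disc

# The three matrices of Example 4
for A in ([[2, 2], [-1, 5]], [[0, -1], [1, 2]], [[2, 3], [-3, 2]]):
    r1, r2, disc = eigenvalues_2x2(A)
    kind = ("two distinct real" if disc > 0
            else "one repeated real" if disc == 0
            else "two complex conjugate")
    print(A, disc, kind, r1, r2)
```

The sign of the discriminant alone settles which case of the theorem applies, so the roots need not be computed when only the nature of the eigenvalues is wanted.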


Symmetric Matrices Have Real Eigenvalues 


Our next result, which is concerned with the eigenvalues of real symmetric matrices, is important in a wide variety of 
applications. The key to its proof is to think of a real symmetric matrix as a complex matrix whose entries have an imaginary 
part of zero. 


THEOREM 5.3.6 


If A is a real symmetric matrix, then A has real eigenvalues. 


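Proof Suppose that $\lambda$ is an eigenvalue of A and $\mathbf{x}$ is a corresponding eigenvector, where we allow for the possibility that $\lambda$ is complex and $\mathbf{x}$ is in $C^n$. Thus,

$$A\mathbf{x} = \lambda\mathbf{x}$$

where $\mathbf{x} \ne \mathbf{0}$. If we multiply both sides of this equation by $\bar{\mathbf{x}}^T$ and use the fact that

$$\bar{\mathbf{x}}^T A\mathbf{x} = \bar{\mathbf{x}}^T(\lambda\mathbf{x}) = \lambda(\bar{\mathbf{x}}^T\mathbf{x}) = \lambda(\mathbf{x} \cdot \mathbf{x}) = \lambda\|\mathbf{x}\|^2$$

then we obtain

$$\lambda = \frac{\bar{\mathbf{x}}^T A\mathbf{x}}{\|\mathbf{x}\|^2}$$

Since the denominator in this expression is real, we can prove that $\lambda$ is real by showing that

$$\overline{\bar{\mathbf{x}}^T A\mathbf{x}} = \bar{\mathbf{x}}^T A\mathbf{x} \tag{14}$$

But A is symmetric and has real entries, so it follows from Formula 5 and properties of the conjugate that

$$\overline{\bar{\mathbf{x}}^T A\mathbf{x}} = \bar{\bar{\mathbf{x}}}^T\,\overline{A\mathbf{x}} = \mathbf{x}^T\,\overline{A\mathbf{x}} = (\overline{A\mathbf{x}})^T\mathbf{x} = (\bar{A}\bar{\mathbf{x}})^T\mathbf{x} = (A\bar{\mathbf{x}})^T\mathbf{x} = \bar{\mathbf{x}}^T A^T\mathbf{x} = \bar{\mathbf{x}}^T A\mathbf{x}$$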
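As a consistency check that is not part of the proof above, Theorem 5.3.6 can be compared with Theorem 5.3.5 in the $2 \times 2$ case. Assuming a real symmetric matrix with entries named $a$, $b$, $d$ (names chosen here for illustration), the discriminant works out as follows:

```latex
A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}
\quad\Longrightarrow\quad
\mathrm{tr}(A)^2 - 4\det(A) = (a + d)^2 - 4(ad - b^2) = (a - d)^2 + 4b^2 \ge 0
```

Since the discriminant is never negative, case (c) of Theorem 5.3.5 cannot occur for a real symmetric $2 \times 2$ matrix, which agrees with the conclusion that its eigenvalues are real.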


A Geometric Interpretation of Complex Eigenvalues 


The following theorem is the key to understanding the geometric significance of complex eigenvalues of real 2 x 2 matrices. 


THEOREM 5.3.7 


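The eigenvalues of the real matrix

$$C = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{15}$$

are $\lambda = a \pm bi$. If a and b are not both zero, then this matrix can be factored as

$$\begin{bmatrix} a & -b \\ b & a \end{bmatrix} = \begin{bmatrix} |\lambda| & 0 \\ 0 & |\lambda| \end{bmatrix}\begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \tag{16}$$

where $\phi$ is the angle from the positive x-axis to the ray that joins the origin to the point (a, b) (Figure 5.3.2).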


(a, b) 


Figure 5.3.2 


Geometrically, this theorem states that multiplication by a matrix of form 15 can be viewed as a rotation through the angle $\phi$ followed by a scaling with factor $|\lambda|$ (Figure 5.3.3).


Figure 5.3.3


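Proof The characteristic equation of C is $(\lambda - a)^2 + b^2 = 0$ (verify), from which it follows that the eigenvalues of C are $\lambda = a \pm bi$. Assuming that a and b are not both zero, let $\phi$ be the angle from the positive x-axis to the ray that joins the origin to the point (a, b). The angle $\phi$ is an argument of the eigenvalue $\lambda = a + bi$, so we see from Figure 5.3.2 that

$$a = |\lambda|\cos\phi \quad \text{and} \quad b = |\lambda|\sin\phi$$

It follows from this that the matrix in 15 can be written as

$$\begin{bmatrix} a & -b \\ b & a \end{bmatrix} = \begin{bmatrix} |\lambda| & 0 \\ 0 & |\lambda| \end{bmatrix}\begin{bmatrix} \dfrac{a}{|\lambda|} & -\dfrac{b}{|\lambda|} \\[2mm] \dfrac{b}{|\lambda|} & \dfrac{a}{|\lambda|} \end{bmatrix} = \begin{bmatrix} |\lambda| & 0 \\ 0 & |\lambda| \end{bmatrix}\begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}$$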
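The factorization in Theorem 5.3.7 is easy to compute in practice. In the sketch below, the function names `scale_and_angle` and `rebuild` are hypothetical; `math.atan2` is used because it returns the angle of the point (a, b) in the range from $-\pi$ to $\pi$, and the sample values a = b = 1 are chosen only for illustration.

```python
import math

def scale_and_angle(a, b):
    """Return (|lambda|, phi) for C = [[a, -b], [b, a]], with (a, b) != (0, 0)."""
    modulus = math.hypot(a, b)    # |lambda| = sqrt(a^2 + b^2)
    phi = math.atan2(b, a)        # angle of the point (a, b)
    return modulus, phi

def rebuild(modulus, phi):
    """Multiply the scaling matrix |lambda| I by the rotation matrix R_phi."""
    return [[modulus * math.cos(phi), -modulus * math.sin(phi)],
            [modulus * math.sin(phi),  modulus * math.cos(phi)]]

# Illustrative values: a = 1, b = 1, so C = [[1, -1], [1, 1]]
modulus, phi = scale_and_angle(1.0, 1.0)
C = rebuild(modulus, phi)
print(modulus, phi, C)
```

For these values the scale factor is $\sqrt{2}$ and the angle is $\pi/4$, and the rebuilt matrix matches C up to rounding.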


The following theorem, whose proof is considered in the exercises, shows that every real 2 x 2 matrix with complex 
eigenvalues is similar to a matrix of form 15. 


THEOREM 5.3.8 


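Let A be a real $2 \times 2$ matrix with complex eigenvalues $\lambda = a \pm bi$ (where $b \ne 0$). If $\mathbf{x}$ is an eigenvector of A corresponding to $\lambda = a - bi$, then the matrix $P = [\mathrm{Re}(\mathbf{x}) \;\; \mathrm{Im}(\mathbf{x})]$ is invertible and

$$A = P\begin{bmatrix} a & -b \\ b & a \end{bmatrix}P^{-1} \tag{17}$$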


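EXAMPLE 5 A Matrix Factorization Using Complex Eigenvalues


Factor the matrix in Example 3 into form 17 using the eigenvalue $\lambda = -i$ and the corresponding eigenvector that was given in 11.


Solution For consistency with the notation in Theorem 5.3.8, let us denote the eigenvector in 11 that corresponds to $\lambda = -i$ by $\mathbf{x}$ (rather than $\bar{\mathbf{x}}$ as before). For this $\lambda$ and $\mathbf{x}$ we have

$$a = 0, \quad b = 1, \quad \mathrm{Re}(\mathbf{x}) = \begin{bmatrix} -\tfrac{2}{5} \\ 1 \end{bmatrix}, \quad \mathrm{Im}(\mathbf{x}) = \begin{bmatrix} -\tfrac{1}{5} \\ 0 \end{bmatrix}$$

Thus,

$$P = [\mathrm{Re}(\mathbf{x}) \;\; \mathrm{Im}(\mathbf{x})] = \begin{bmatrix} -\tfrac{2}{5} & -\tfrac{1}{5} \\ 1 & 0 \end{bmatrix}$$

so A can be factored in form 17 as

$$\begin{bmatrix} -2 & -1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} -\tfrac{2}{5} & -\tfrac{1}{5} \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ -5 & -2 \end{bmatrix}$$

You may want to confirm this by multiplying out the right side.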
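One way to carry out that confirmation is with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` so that no rounding occurs; the helpers `matmul` and `inverse_2x2` are hypothetical names written for this illustration.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix with nonzero determinant."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

# Matrices from Example 5
A = [[F(-2), F(-1)], [F(5), F(2)]]
P = [[F(-2, 5), F(-1, 5)], [F(1), F(0)]]
C = [[F(0), F(-1)], [F(1), F(0)]]      # form 15 with a = 0, b = 1

P_inv = inverse_2x2(P)
product = matmul(matmul(P, C), P_inv)
print(P_inv, product == A)
```

Because every entry is a fraction, the comparison `product == A` is an exact test rather than an approximate one.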


A Geometric Interpretation of Theorem 5.3.8 


To clarify what Theorem 5.3.8 says geometrically, let us denote the matrices on the right side of 16 by S and $R_\phi$, respectively, and then use 16 to rewrite 17 as

$$A = PSR_\phi P^{-1} = P\begin{bmatrix} |\lambda| & 0 \\ 0 & |\lambda| \end{bmatrix}\begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}P^{-1} \tag{18}$$

If we now view P as the transition matrix from the basis $B = \{\mathrm{Re}(\mathbf{x}), \mathrm{Im}(\mathbf{x})\}$ to the standard basis, then 18 tells us that computing a product $A\mathbf{x}_0$ can be broken down into a three-step process:

Step 1 Map $\mathbf{x}_0$ from standard coordinates into B-coordinates by forming the product $P^{-1}\mathbf{x}_0$.
Step 2 Rotate and scale the vector $P^{-1}\mathbf{x}_0$ by forming the product $SR_\phi P^{-1}\mathbf{x}_0$.
Step 3 Map the rotated and scaled vector back to standard coordinates to obtain $A\mathbf{x}_0 = PSR_\phi P^{-1}\mathbf{x}_0$.


Power Sequences 


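There are many problems in which one is interested in how successive applications of a matrix transformation affect a specific vector. For example, if A is the standard matrix for an operator on $R^n$ and $\mathbf{x}_0$ is some fixed vector in $R^n$, then one might be interested in the behavior of the power sequence

$$\mathbf{x}_0, \; A\mathbf{x}_0, \; A^2\mathbf{x}_0, \; \ldots, \; A^k\mathbf{x}_0, \; \ldots$$

For example, if

$$A = \begin{bmatrix} \tfrac{1}{2} & \tfrac{3}{4} \\ -\tfrac{3}{5} & \tfrac{11}{10} \end{bmatrix} \quad \text{and} \quad \mathbf{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

then with the help of a computer or calculator one can show that the first four terms in the power sequence are

$$\mathbf{x}_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad A\mathbf{x}_0 = \begin{bmatrix} 1.25 \\ 0.5 \end{bmatrix}, \quad A^2\mathbf{x}_0 = \begin{bmatrix} 1.0 \\ -0.2 \end{bmatrix}, \quad A^3\mathbf{x}_0 = \begin{bmatrix} 0.35 \\ -0.82 \end{bmatrix}$$

With the help of MATLAB or a computer algebra system one can show that if the first 100 terms are plotted as ordered pairs (x, y), then the points move along the elliptical path shown in Figure 5.3.4a.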


Figure 5.3.4 [panels (a), (b), and (c)]


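To understand why the points move along an elliptical path, we will need to examine the eigenvalues and eigenvectors of A. We leave it for you to show that the eigenvalues of A are $\lambda = \tfrac{4}{5} \pm \tfrac{3}{5}i$ and that the corresponding eigenvectors are

$$\lambda_1 = \tfrac{4}{5} - \tfrac{3}{5}i: \;\; \mathbf{v}_1 = \left(\tfrac{1}{2} + i,\, 1\right) \quad \text{and} \quad \lambda_2 = \tfrac{4}{5} + \tfrac{3}{5}i: \;\; \mathbf{v}_2 = \left(\tfrac{1}{2} - i,\, 1\right)$$

If we take $\lambda = \lambda_1 = \tfrac{4}{5} - \tfrac{3}{5}i$ and $\mathbf{x} = \mathbf{v}_1 = \left(\tfrac{1}{2} + i, 1\right)$ in 17 and use the fact that $|\lambda| = 1$, then we obtain the factorization

$$\begin{bmatrix} \tfrac{1}{2} & \tfrac{3}{4} \\ -\tfrac{3}{5} & \tfrac{11}{10} \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \tfrac{4}{5} & -\tfrac{3}{5} \\ \tfrac{3}{5} & \tfrac{4}{5} \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & -\tfrac{1}{2} \end{bmatrix}, \qquad A = PR_\phi P^{-1} \tag{19}$$

where $R_\phi$ is a rotation about the origin through the angle $\phi$ whose tangent is

$$\tan\phi = \frac{\sin\phi}{\cos\phi} = \frac{3/5}{4/5} = \frac{3}{4}$$

The matrix P in 19 is the transition matrix from the basis

$$B = \{\mathrm{Re}(\mathbf{x}), \mathrm{Im}(\mathbf{x})\} = \left\{\left(\tfrac{1}{2}, 1\right), (1, 0)\right\}$$

to the standard basis, and $P^{-1}$ is the transition matrix from the standard basis to the basis B (Figure 5.3.5). Next, observe that if n is a positive integer, then 19 implies that

$$A^n\mathbf{x}_0 = (PR_\phi P^{-1})^n\mathbf{x}_0 = PR_\phi^n P^{-1}\mathbf{x}_0$$

so the product $A^n\mathbf{x}_0$ can be computed by first mapping $\mathbf{x}_0$ into the point $P^{-1}\mathbf{x}_0$ in B-coordinates, then multiplying by $R_\phi^n$ to rotate this point about the origin through the angle $n\phi$, and then multiplying $R_\phi^n P^{-1}\mathbf{x}_0$ by P to map the resulting point back to standard coordinates. We can now see what is happening geometrically: In B-coordinates each successive multiplication by A causes the point $P^{-1}\mathbf{x}_0$ to advance through an angle $\phi$, thereby tracing a circular orbit about the origin. However, the basis B is skewed (not orthogonal), so when the points on the circular orbit are transformed back to standard coordinates, the effect is to distort the circular orbit into the elliptical orbit traced by $A^n\mathbf{x}_0$ (Figure 5.3.4b). Here are the computations for the first step (successive steps are illustrated in Figure 5.3.4c):

$$\begin{bmatrix} \tfrac{1}{2} & \tfrac{3}{4} \\ -\tfrac{3}{5} & \tfrac{11}{10} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \tfrac{4}{5} & -\tfrac{3}{5} \\ \tfrac{3}{5} & \tfrac{4}{5} \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

$$= \begin{bmatrix} \tfrac{1}{2} & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \tfrac{4}{5} & -\tfrac{3}{5} \\ \tfrac{3}{5} & \tfrac{4}{5} \end{bmatrix}\begin{bmatrix} 1 \\ \tfrac{1}{2} \end{bmatrix} \quad [\mathbf{x}_0 \text{ is mapped to } B\text{-coordinates.}]$$

$$= \begin{bmatrix} \tfrac{1}{2} & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \tfrac{1}{2} \\ 1 \end{bmatrix} \quad [\text{The point } (1, \tfrac{1}{2}) \text{ is rotated through the angle } \phi.]$$

$$= \begin{bmatrix} \tfrac{5}{4} \\ \tfrac{1}{2} \end{bmatrix} \quad [\text{The point } (\tfrac{1}{2}, 1) \text{ is mapped to standard coordinates.}]$$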

Figure 5.3.5
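The circular-orbit argument can be tested directly. The following sketch generates the first 100 terms of the power sequence with exact fractions and then checks that, in B-coordinates, every term is the same distance from the origin. The helper names `matvec` and `power_sequence` are hypothetical; the matrices are the ones in factorization 19.

```python
from fractions import Fraction as F

def matvec(M, x):
    """Multiply a 2 x 2 matrix by a vector with two entries."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def power_sequence(A, x0, count):
    """Return the list x0, A x0, A^2 x0, ... with `count` terms."""
    terms = [x0]
    for _ in range(count - 1):
        terms.append(matvec(A, terms[-1]))
    return terms

A = [[F(1, 2), F(3, 4)], [F(-3, 5), F(11, 10)]]
P_inv = [[F(0), F(1)], [F(1), F(-1, 2)]]     # inverse of P in factorization 19

terms = power_sequence(A, [F(1), F(1)], 100)

# Squared distance from the origin of each term, measured in B-coordinates
radii_squared = {sum(c * c for c in matvec(P_inv, x)) for x in terms}
print(terms[:4], radii_squared)
```

The set `radii_squared` contains a single value, which is what a circular orbit in B-coordinates requires; mapping back through P then produces the ellipse in standard coordinates.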


Concept Review 


* Real part of z
* Imaginary part of z
* Modulus of z
* Complex conjugate of z
* Argument of z
* Polar form of z
* Complex vector space
* Complex n-tuple
* Complex n-space
* Real matrix
* Complex matrix
* Complex Euclidean inner product
* Euclidean norm on $C^n$
* Antisymmetry property
* Complex eigenvalue
* Complex eigenvector
* Eigenspace in $C^n$
* Discriminant


Skills 

* Find the real part, imaginary part, and complex conjugate of a complex matrix or vector. 
* Find the determinant of a complex matrix. 

* Find complex inner products and norms of complex vectors. 

* Find the eigenvalues and bases for the eigenspaces of complex matrices. 


* Factor a 2 x 2 real matrix with complex eigenvalues into a product of a scaling matrix and a rotation matrix.


Exercise Set 5.3 


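In Exercises 1-2, find $\bar{\mathbf{u}}$, Re(u), Im(u), and $\|\mathbf{u}\|$.

1. $\mathbf{u} = (2 - i,\, 4i,\, 1 + i)$

Answer:

$\bar{\mathbf{u}} = (2 + i,\, -4i,\, 1 - i)$, $\mathrm{Re}(\mathbf{u}) = (2, 0, 1)$, $\mathrm{Im}(\mathbf{u}) = (-1, 4, 1)$, $\|\mathbf{u}\| = \sqrt{23}$

2. $\mathbf{u} = (6,\, 1 + 4i,\, 6 - 2i)$


In Exercises 3-4, show that u, v, and k satisfy Theorem 5.3.1.

3. $\mathbf{u} = (3 - 4i,\, 2 + i,\, -6i)$, $\mathbf{v} = (1 + i,\, 2 - i,\, 4)$, $k = i$

4. $\mathbf{u} = (6,\, 1 + 4i,\, 6 - 2i)$, $\mathbf{v} = (4,\, 3 + 2i,\, i - 3)$, $k = -i$

5. Solve the equation $i\mathbf{x} - 3\mathbf{v} = \bar{\mathbf{u}}$ for $\mathbf{x}$, where u and v are the vectors in Exercise 3.

Answer:

$\mathbf{x} = (7 - 6i,\, -4 - 8i,\, 6 - 12i)$

6. Solve the equation $(1 + i)\mathbf{x} + 2\mathbf{u} = \mathbf{v}$ for $\mathbf{x}$, where u and v are the vectors in Exercise 4.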


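In Exercises 7-8, find $\bar{A}$, Re(A), Im(A), det(A), and tr(A).

7. $A = \begin{bmatrix} -5i & 4 \\ 2 - i & 1 + 5i \end{bmatrix}$

Answer:

$\bar{A} = \begin{bmatrix} 5i & 4 \\ 2 + i & 1 - 5i \end{bmatrix}$, $\mathrm{Re}(A) = \begin{bmatrix} 0 & 4 \\ 2 & 1 \end{bmatrix}$, $\mathrm{Im}(A) = \begin{bmatrix} -5 & 0 \\ -1 & 5 \end{bmatrix}$, $\det(A) = 17 - i$, $\mathrm{tr}(A) = 1$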
8. 4 2-3 
А= 

bs] 


9. Let A be the matrix given in Exercise 7, and let B be the matrix 


at 


Confirm that these matrices have the properties stated in Theorem 5.3.2. 


10. Let A be the matrix given in Exercise 8, and let B be the matrix 
5i 
B= 
| 1- " 
Confirm that these matrices have the properties stated in Theorem 5.3.2. 


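In Exercises 11-12, compute $\mathbf{u} \cdot \mathbf{v}$, $\mathbf{u} \cdot \mathbf{w}$, and $\mathbf{v} \cdot \mathbf{w}$, and show that the vectors satisfy Formula 5 and parts (a), (b), and (c) of Theorem 5.3.3.


11. $\mathbf{u} = (i,\, 2i,\, 3)$, $\mathbf{v} = (4,\, -2i,\, 1 + i)$, $\mathbf{w} = (2 - i,\, 2i,\, 5 + 3i)$, $k = 2i$
Answer:


$\mathbf{u} \cdot \mathbf{v} = -1 + i$, $\mathbf{u} \cdot \mathbf{w} = 18 - 7i$, $\mathbf{v} \cdot \mathbf{w} = 12 + 6i$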
12.0= (1 +i, 4, 3i), v—(5,—4i2--3), w-—(l-i4ií4-5i), k=1+4: 


13. Compute (u · v) —wW~ її for the vectors u, v, and w in Exercise 11. 


Answer: 


—-11- 14i 


14. Compute (iu . w) ++ C[[u]|v) * u for the vectors u, v, and w in Exercise 12. 


In Exercises 15—18, find the eigenvalues and bases for the eigenspaces of A. 


15. ,. [4 —5 
-[ 7%] 


Answer: 


Ay =2=3, “= (27 №=2 +1, = | E] 


КЕЕ 


4 7 
17. 5 —2 
A= 
mE 
Answer: 


Ay =4—, Еи Ag =4 +i, a=] 


18., [86 
|54] 


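In Exercises 19-22, each matrix C has form 15. Theorem 5.3.7 implies that C is the product of a scaling matrix with factor $|\lambda|$ and a rotation matrix with angle $\phi$. Find $|\lambda|$ and $\phi$ for which $-\pi < \phi \le \pi$.

19. $C = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$

Answer:

$|\lambda| = \sqrt{2}$, $\phi = \dfrac{\pi}{4}$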


20.4 [05 
es o] 


21. $C = \begin{bmatrix} 1 & \sqrt{3} \\ -\sqrt{3} & 1 \end{bmatrix}$
Answer:
$|\lambda| = 2$, $\phi = -\dfrac{\pi}{3}$


dq e 


In Exercises 23-26, find an invertible matrix P and a matrix C of form 15 such that $A = PCP^{-1}$.


PERLES 


Des /2 ¥2 


4 3 


Answer: 
=2 =] 3 =2 
al [2 53) 


Кг 


1 0 
25. в 6 
A= 
1-32) 
Answer: 


- hel 


26. 5 —2 
А= 


27. Find all complex scalars k, if any, for which u and v are orthogonal in $C^3$.
(a) u= (2i, i, 3), v= (1, 6,4) 
(b) u= (k, k, 1 +i), у=(1,—1,1—7) 


Answer: 


© k= – 23 
(b) None 
28. Show that if A is a real $n \times n$ matrix and $\mathbf{x}$ is a column vector in $C^n$, then $\mathrm{Re}(A\mathbf{x}) = A(\mathrm{Re}(\mathbf{x}))$ and $\mathrm{Im}(A\mathbf{x}) = A(\mathrm{Im}(\mathbf{x}))$.


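29. The matrices

$$\sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

called Pauli spin matrices, are used in quantum mechanics to study particle spin. The Dirac matrices, which are also used in quantum mechanics, are expressed in terms of the Pauli spin matrices and the $2 \times 2$ identity matrix $I_2$ as

$$\beta = \begin{bmatrix} I_2 & 0 \\ 0 & -I_2 \end{bmatrix}, \quad \alpha_x = \begin{bmatrix} 0 & \sigma_1 \\ \sigma_1 & 0 \end{bmatrix}, \quad \alpha_y = \begin{bmatrix} 0 & \sigma_2 \\ \sigma_2 & 0 \end{bmatrix}, \quad \alpha_z = \begin{bmatrix} 0 & \sigma_3 \\ \sigma_3 & 0 \end{bmatrix}$$

(a) Show that $\beta^2 = \alpha_x^2 = \alpha_y^2 = \alpha_z^2$.

(b) Matrices A and B for which $AB = -BA$ are said to be anticommutative. Show that the Dirac matrices are anticommutative.

30. If k is a real scalar and $\mathbf{v}$ is a vector in $R^n$, then Theorem 3.2.1 states that $\|k\mathbf{v}\| = |k|\,\|\mathbf{v}\|$. Is this relationship also true if k is a complex scalar and $\mathbf{v}$ is a vector in $C^n$? Justify your answer.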
31. Prove part ( c) of Theorem 5.3.1. 
32. Prove Theorem 5.3.2. 


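33. Prove that if $\mathbf{u}$ and $\mathbf{v}$ are vectors in $C^n$, then

$$\mathbf{u} \cdot \mathbf{v} = \tfrac{1}{4}\|\mathbf{u} + \mathbf{v}\|^2 - \tfrac{1}{4}\|\mathbf{u} - \mathbf{v}\|^2 + \tfrac{i}{4}\|\mathbf{u} + i\mathbf{v}\|^2 - \tfrac{i}{4}\|\mathbf{u} - i\mathbf{v}\|^2$$

34. It follows from Theorem 5.3.7 that the eigenvalues of the rotation matrix

$$R_\phi = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}$$

are $\lambda = \cos\phi \pm i\sin\phi$. Prove that if $\mathbf{x}$ is an eigenvector corresponding to either eigenvalue, then $\mathrm{Re}(\mathbf{x})$ and $\mathrm{Im}(\mathbf{x})$ are orthogonal and have the same length. [Note: This implies that $P = [\mathrm{Re}(\mathbf{x}) \;\; \mathrm{Im}(\mathbf{x})]$ is a real scalar multiple of an orthogonal matrix.]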


35. The two parts of this exercise lead you through a proof of Theorem 5.3.8.

(a) For notational simplicity, let

$$M = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$$

and let $\mathbf{u} = \mathrm{Re}(\mathbf{x})$ and $\mathbf{v} = \mathrm{Im}(\mathbf{x})$, so $P = [\mathbf{u} \mid \mathbf{v}]$. Show that the relationship $A\mathbf{x} = \lambda\mathbf{x}$ implies that

$$A\mathbf{x} = (a\mathbf{u} + b\mathbf{v}) + i(-b\mathbf{u} + a\mathbf{v})$$

and then equate real and imaginary parts in this equation to show that

$$AP = [A\mathbf{u} \mid A\mathbf{v}] = [a\mathbf{u} + b\mathbf{v} \mid -b\mathbf{u} + a\mathbf{v}] = PM$$

(b) Show that P is invertible, thereby completing the proof, since the result in part (a) implies that $A = PMP^{-1}$. [Hint: If P is not invertible, then one of its column vectors is a real scalar multiple of the other, say $\mathbf{v} = c\mathbf{u}$. Substitute this into the equations $A\mathbf{u} = a\mathbf{u} + b\mathbf{v}$ and $A\mathbf{v} = -b\mathbf{u} + a\mathbf{v}$ obtained in part (a), and show that $(1 + c^2)b\mathbf{u} = \mathbf{0}$. Finally, show that this leads to a contradiction, thereby proving that P is invertible.]


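36. In this problem you will prove the complex analog of the Cauchy-Schwarz inequality.

(a) Prove: If k is a complex number, and $\mathbf{u}$ and $\mathbf{v}$ are vectors in $C^n$, then

$$(\mathbf{u} - k\mathbf{v}) \cdot (\mathbf{u} - k\mathbf{v}) = \mathbf{u} \cdot \mathbf{u} - \bar{k}(\mathbf{u} \cdot \mathbf{v}) - k\,\overline{(\mathbf{u} \cdot \mathbf{v})} + k\bar{k}(\mathbf{v} \cdot \mathbf{v})$$

(b) Use the result in part (a) to prove that

$$0 \le \mathbf{u} \cdot \mathbf{u} - \bar{k}(\mathbf{u} \cdot \mathbf{v}) - k\,\overline{(\mathbf{u} \cdot \mathbf{v})} + k\bar{k}(\mathbf{v} \cdot \mathbf{v})$$

(c) Take $k = (\mathbf{u} \cdot \mathbf{v})/(\mathbf{v} \cdot \mathbf{v})$ in part (b) to prove that

$$|\mathbf{u} \cdot \mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$$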


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) There is a real 5 x 5 matrix with no real eigenvalues. 


Answer: 


False 


(b) The eigenvalues of a $2 \times 2$ complex matrix are the solutions of the equation $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$.


Answer: 


True 


(c) Matrices that have the same complex eigenvalues with the same algebraic multiplicities have the same trace. 
Answer: 


False 


(d) If $\lambda$ is a complex eigenvalue of a real matrix $A$ with a corresponding complex eigenvector $\mathbf{v}$, then $\bar{\lambda}$ is a complex eigenvalue of $A$ and $\bar{\mathbf{v}}$ is a complex eigenvector of $A$ corresponding to $\bar{\lambda}$.


Answer: 


True 


(e) Every eigenvalue of a complex symmetric matrix is real. 
Answer: 


False 


(f) If a $2 \times 2$ real matrix $A$ has complex eigenvalues and $\mathbf{x}_0$ is a vector in $R^2$, then the vectors $\mathbf{x}_0, A\mathbf{x}_0, A^2\mathbf{x}_0, \ldots, A^n\mathbf{x}_0, \ldots$ lie on an ellipse.
Answer: 


False 
5.4 Differential Equations


Many laws of physics, chemistry, biology, engineering, and economics are described in terms of "differential equations," that is, equations involving functions and their derivatives. In this section we will illustrate one way in which linear algebra, and eigenvalues and eigenvectors in particular, can be applied to solving systems of differential equations. Calculus is a prerequisite for this section.


Terminology 


Recall from calculus that a differential equation is an equation involving unknown functions and their derivatives. The order of a differential equation is the order of the highest derivative it contains. The simplest differential equations are the first-order equations of the form
$$y' = ay \tag{1}$$
where $y = f(x)$ is an unknown differentiable function to be determined, $y' = dy/dx$ is its derivative, and $a$ is a constant. As with most differential equations, this equation has infinitely many solutions; they are the functions of the form
$$y = ce^{ax} \tag{2}$$
where $c$ is an arbitrary constant. That every function of this form is a solution of (1) follows from the computation
$$y' = cae^{ax} = ay$$
and that these are the only solutions is shown in the exercises. Accordingly, we call (2) the general solution of (1). As an example, the general solution of the differential equation $y' = 5y$ is
$$y = ce^{5x} \tag{3}$$

Often, a physical problem that leads to a differential equation imposes some conditions that enable us to isolate one particular solution from the general solution. For example, if we require that solution (3) of the equation $y' = 5y$ satisfy the added condition
$$y(0) = 6 \tag{4}$$
(that is, $y = 6$ when $x = 0$), then on substituting these values in (3), we obtain $6 = ce^0 = c$, from which we conclude that
$$y = 6e^{5x}$$
is the only solution of $y' = 5y$ that satisfies (4).

A condition such as (4), which specifies the value of the general solution at a point, is called an initial condition, and the problem of solving a differential equation subject to an initial condition is called an initial-value problem.
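
The initial-value problem $y' = 5y$, $y(0) = 6$ can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `general_solution` and `derivative`, the test point `x = 0.3`, and the step size are illustrative assumptions. It builds $y = ce^{ax}$ with $c$ fixed by the initial condition and measures how closely a finite-difference derivative matches $ay$.

```python
import math

def general_solution(c, a):
    """Return the function y(x) = c * e^(a x), the general solution of y' = a y."""
    return lambda x: c * math.exp(a * x)

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x) (illustrative helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)

a = 5.0
y0 = 6.0      # initial condition y(0) = 6
c = y0        # because y(0) = c * e^0 = c
y = general_solution(c, a)

x = 0.3       # arbitrary test point
residual = derivative(y, x) - a * y(x)   # should be close to zero
```

The residual is not exactly zero because the derivative is approximated; it only indicates that the closed-form solution is consistent with the differential equation.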


First-Order Linear Systems 


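In this section we will be concerned with solving systems of differential equations of the form
$$\begin{aligned}
y_1' &= a_{11}y_1 + a_{12}y_2 + \cdots + a_{1n}y_n \\
y_2' &= a_{21}y_1 + a_{22}y_2 + \cdots + a_{2n}y_n \\
&\;\;\vdots \\
y_n' &= a_{n1}y_1 + a_{n2}y_2 + \cdots + a_{nn}y_n
\end{aligned} \tag{5}$$
where $y_1 = f_1(x)$, $y_2 = f_2(x)$, ..., $y_n = f_n(x)$ are functions to be determined, and the $a_{ij}$'s are constants. A system of differential equations of form (5) is called a first-order linear system. In matrix notation, (5) can be written as
$$\begin{bmatrix} y_1' \\ y_2' \\ \vdots \\ y_n' \end{bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$
or, more briefly, as
$$\mathbf{y}' = A\mathbf{y} \tag{6}$$
where the notation $\mathbf{y}'$ denotes the vector obtained by differentiating each component of $\mathbf{y}$.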


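EXAMPLE 1 Solution of a Linear System with Initial Conditions

(a) Write the following system in matrix form:
$$\begin{aligned} y_1' &= 3y_1 \\ y_2' &= -2y_2 \\ y_3' &= 5y_3 \end{aligned} \tag{7}$$

(b) Solve the system.

(c) Find a solution of the system that satisfies the initial conditions $y_1(0) = 1$, $y_2(0) = 4$, and $y_3(0) = -2$.

Solution
(a)
$$\begin{bmatrix} y_1' \\ y_2' \\ y_3' \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \tag{8}$$
or
$$\mathbf{y}' = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 5 \end{bmatrix} \mathbf{y} \tag{9}$$

(b) Because each equation in (7) involves only one unknown function, we can solve the equations individually. It follows from (2) that these solutions are
$$y_1 = c_1e^{3x}, \qquad y_2 = c_2e^{-2x}, \qquad y_3 = c_3e^{5x}$$
or, in matrix notation,
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} c_1e^{3x} \\ c_2e^{-2x} \\ c_3e^{5x} \end{bmatrix} \tag{10}$$

(c) From the given initial conditions, we obtain
$$\begin{aligned} 1 &= y_1(0) = c_1e^0 = c_1 \\ 4 &= y_2(0) = c_2e^0 = c_2 \\ -2 &= y_3(0) = c_3e^0 = c_3 \end{aligned}$$
so the solution satisfying these conditions is
$$y_1 = e^{3x}, \qquad y_2 = 4e^{-2x}, \qquad y_3 = -2e^{5x}$$
or, in matrix notation,
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} e^{3x} \\ 4e^{-2x} \\ -2e^{5x} \end{bmatrix}$$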


Solution by Diagonalization 


What made the system in Example 1 easy to solve was the fact that each equation involved only one of the unknown functions, so its matrix formulation, $\mathbf{y}' = A\mathbf{y}$, had a diagonal coefficient matrix $A$ [Formula 9]. A more complicated situation occurs when some or all of the equations in the system involve more than one of the unknown functions, for in this case the coefficient matrix is not diagonal. Let us now consider how we might solve such a system.

The basic idea for solving a system $\mathbf{y}' = A\mathbf{y}$ whose coefficient matrix $A$ is not diagonal is to introduce a new unknown vector $\mathbf{u}$ that is related to the unknown vector $\mathbf{y}$ by an equation of the form $\mathbf{y} = P\mathbf{u}$ in which $P$ is an invertible matrix that diagonalizes $A$. Of course, such a matrix may or may not exist, but if it does, then we can rewrite the equation $\mathbf{y}' = A\mathbf{y}$ as
$$P\mathbf{u}' = A(P\mathbf{u})$$
or alternatively as
$$\mathbf{u}' = (P^{-1}AP)\mathbf{u}$$
Since $P$ is assumed to diagonalize $A$, this equation has the form
$$\mathbf{u}' = D\mathbf{u}$$
where $D$ is diagonal. We can now solve this equation for $\mathbf{u}$ using the method of Example 1, and then obtain $\mathbf{y}$ by matrix multiplication using the relationship $\mathbf{y} = P\mathbf{u}$.

In summary, we have the following procedure for solving a system $\mathbf{y}' = A\mathbf{y}$ in the case where $A$ is diagonalizable.

A Procedure for Solving $\mathbf{y}' = A\mathbf{y}$ if $A$ is Diagonalizable

Step 1. Find a matrix $P$ that diagonalizes $A$.

Step 2. Make the substitutions $\mathbf{y} = P\mathbf{u}$ and $\mathbf{y}' = P\mathbf{u}'$ to obtain a new "diagonal system" $\mathbf{u}' = D\mathbf{u}$, where $D = P^{-1}AP$.

Step 3. Solve $\mathbf{u}' = D\mathbf{u}$.

Step 4. Determine $\mathbf{y}$ from the equation $\mathbf{y} = P\mathbf{u}$.


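EXAMPLE 2 Solution Using Diagonalization

(a) Solve the system
$$\begin{aligned} y_1' &= y_1 + y_2 \\ y_2' &= 4y_1 - 2y_2 \end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = 1$, $y_2(0) = 6$.

Solution

(a) The coefficient matrix for the system is
$$A = \begin{bmatrix} 1 & 1 \\ 4 & -2 \end{bmatrix}$$
As discussed in Section 5.2, $A$ will be diagonalized by any matrix $P$ whose columns are linearly independent eigenvectors of $A$. Since
$$\det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & -1 \\ -4 & \lambda + 2 \end{vmatrix} = \lambda^2 + \lambda - 6 = (\lambda + 3)(\lambda - 2)$$
the eigenvalues of $A$ are $\lambda = 2$ and $\lambda = -3$. By definition,
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
is an eigenvector of $A$ corresponding to $\lambda$ if and only if $\mathbf{x}$ is a nontrivial solution of
$$\begin{bmatrix} \lambda - 1 & -1 \\ -4 & \lambda + 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
If $\lambda = 2$, this system becomes
$$\begin{bmatrix} 1 & -1 \\ -4 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Solving this system yields $x_1 = t$, $x_2 = t$, so
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} t \\ t \end{bmatrix} = t\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
Thus,
$$\mathbf{p}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
is a basis for the eigenspace corresponding to $\lambda = 2$. Similarly, you can show that
$$\mathbf{p}_2 = \begin{bmatrix} -\tfrac{1}{4} \\ 1 \end{bmatrix}$$
is a basis for the eigenspace corresponding to $\lambda = -3$. Thus,
$$P = \begin{bmatrix} 1 & -\tfrac{1}{4} \\ 1 & 1 \end{bmatrix}$$
diagonalizes $A$, and
$$D = P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}$$
Thus, as noted in Step 2 of the procedure stated above, the substitution
$$\mathbf{y} = P\mathbf{u} \quad\text{and}\quad \mathbf{y}' = P\mathbf{u}'$$
yields the "diagonal system"
$$\mathbf{u}' = D\mathbf{u} = \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix}\mathbf{u} \qquad\text{or}\qquad \begin{aligned} u_1' &= 2u_1 \\ u_2' &= -3u_2 \end{aligned}$$
From (2) the solution of this system is
$$\begin{aligned} u_1 &= c_1e^{2x} \\ u_2 &= c_2e^{-3x} \end{aligned} \qquad\text{or}\qquad \mathbf{u} = \begin{bmatrix} c_1e^{2x} \\ c_2e^{-3x} \end{bmatrix}$$
so the equation $\mathbf{y} = P\mathbf{u}$ yields, as the solution for $\mathbf{y}$,
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 & -\tfrac{1}{4} \\ 1 & 1 \end{bmatrix} \begin{bmatrix} c_1e^{2x} \\ c_2e^{-3x} \end{bmatrix} = \begin{bmatrix} c_1e^{2x} - \tfrac{1}{4}c_2e^{-3x} \\ c_1e^{2x} + c_2e^{-3x} \end{bmatrix}$$
or
$$\begin{aligned} y_1 &= c_1e^{2x} - \tfrac{1}{4}c_2e^{-3x} \\ y_2 &= c_1e^{2x} + c_2e^{-3x} \end{aligned} \tag{11}$$

(b) If we substitute the given initial conditions in (11), we obtain
$$\begin{aligned} c_1 - \tfrac{1}{4}c_2 &= 1 \\ c_1 + c_2 &= 6 \end{aligned}$$
Solving this system, we obtain $c_1 = 2$, $c_2 = 4$, so it follows from (11) that the solution satisfying the initial conditions is
$$\begin{aligned} y_1 &= 2e^{2x} - e^{-3x} \\ y_2 &= 2e^{2x} + 4e^{-3x} \end{aligned}$$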
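
The computation in Example 2 can be retraced in a few lines of code. The sketch below is illustrative only: it takes the eigenvalues and eigenvectors found above as given rather than computing them, and the helper names `solve_2x2` and `solution` are assumptions introduced here. It solves $P\mathbf{c} = \mathbf{y}(0)$ for the constants and then forms $\mathbf{y} = P\mathbf{u}$ with $u_i = c_ie^{\lambda_i x}$.

```python
import math

# Data taken from Example 2: eigenvalues 2 and -3, with eigenvectors
# (1, 1) and (-1/4, 1) placed in the columns of P.
eigenvalues = [2.0, -3.0]
P = [[1.0, -0.25],
     [1.0, 1.0]]

def solve_2x2(M, b):
    """Solve the 2x2 linear system M c = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    c1 = (b[0] * M[1][1] - M[0][1] * b[1]) / det
    c2 = (M[0][0] * b[1] - b[0] * M[1][0]) / det
    return [c1, c2]

def solution(y_init):
    """Return (c, y) where y(x) solves y' = Ay with y(0) = y_init."""
    c = solve_2x2(P, y_init)              # y(0) = P u(0) = P c
    def y(x):
        # Step 3: solve the diagonal system u' = Du component by component.
        u = [c[i] * math.exp(eigenvalues[i] * x) for i in range(2)]
        # Step 4: recover y from y = P u.
        return [P[0][0] * u[0] + P[0][1] * u[1],
                P[1][0] * u[0] + P[1][1] * u[1]]
    return c, y

c, y = solution([1.0, 6.0])   # initial conditions y1(0) = 1, y2(0) = 6
```

For a matrix whose eigenvectors are not already known, Step 1 would have to be carried out first; this sketch deliberately skips that step.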


Remark Keep in mind that the method of Example 2 works because the coefficient matrix of the system can be 
diagonalized. In cases where this is not so, other methods are required. These are typically discussed in books 
devoted to differential equations. 


Concept Review 


Differential equation 


Order of a differential equation 


General solution 


Particular solution 


Initial condition 


Initial-value problem 


First-order linear system 


Skills 
* Find the matrix form of a system of linear differential equations. 
* Find the general solution of a system of linear differential equations by diagonalization. 


* Find the particular solution of a system of linear differential equations satisfying an initial condition. 


Exercise Set 5.4 


1. (a) Solve the system
$$\begin{aligned} y_1' &= y_1 + 4y_2 \\ y_2' &= 2y_1 + 3y_2 \end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = 0$, $y_2(0) = 0$.

Answer:

(a) $y_1 = c_1e^{5x} - 2c_2e^{-x}$
$y_2 = c_1e^{5x} + c_2e^{-x}$
(b) $y_1 = 0$
$y_2 = 0$


2. (a) Solve the system
$$\begin{aligned} y_1' &= y_1 + 3y_2 \\ y_2' &= 4y_1 + 5y_2 \end{aligned}$$

(b) Find the solution that satisfies the conditions $y_1(0) = 2$, $y_2(0) = 1$.


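3. (a) Solve the system
$$\begin{aligned} y_1' &= 4y_1 + y_3 \\ y_2' &= -2y_1 + y_2 \\ y_3' &= -2y_1 + y_3 \end{aligned}$$

(b) Find the solution that satisfies the initial conditions $y_1(0) = -1$, $y_2(0) = 1$, $y_3(0) = 0$.

Answer:

(a) $y_1 = -c_2e^{2x} + c_3e^{3x}$
$y_2 = c_1e^{x} + 2c_2e^{2x} - c_3e^{3x}$
$y_3 = 2c_2e^{2x} - c_3e^{3x}$
(b) $y_1 = e^{2x} - 2e^{3x}$
$y_2 = e^{x} - 2e^{2x} + 2e^{3x}$
$y_3 = -2e^{2x} + 2e^{3x}$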


4. Solve the system
$$\begin{aligned} y_1' &= 4y_1 + 2y_2 + 2y_3 \\ y_2' &= 2y_1 + 4y_2 + 2y_3 \\ y_3' &= 2y_1 + 2y_2 + 4y_3 \end{aligned}$$


5. Show that every solution of $y' = ay$ has the form $y = ce^{ax}$.

[Hint: Let $y = f(x)$ be a solution of the equation, and show that $f(x)e^{-ax}$ is constant.]

6. Show that if $A$ is diagonalizable and
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$
is a solution of the system $\mathbf{y}' = A\mathbf{y}$, then each $y_i$ is a linear combination of $e^{\lambda_1 x}, e^{\lambda_2 x}, \ldots, e^{\lambda_n x}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$.

7. Sometimes it is possible to solve a single higher-order linear differential equation with constant coefficients by expressing it as a system and applying the methods of this section. For the differential equation $y'' - y' - 6y = 0$, show that the substitutions $y_1 = y$ and $y_2 = y'$ lead to the system
$$\begin{aligned} y_1' &= y_2 \\ y_2' &= 6y_1 + y_2 \end{aligned}$$
Solve this system, and use the result to solve the original differential equation.

Answer:

$y = c_1e^{3x} + c_2e^{-2x}$


8. Use the procedure in Exercise 7 to solve $y'' + y' - 12y = 0$.

9. Explain how you might use the procedure in Exercise 7 to solve $y''' - 6y'' + 11y' - 6y = 0$. Use your procedure to solve the equation.

Answer:

$y = c_1e^{x} + c_2e^{2x} + c_3e^{3x}$


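10. (a) By rewriting (11) in matrix form, show that the solution of the system in Example 2 can be expressed as
$$\mathbf{y} = c_1e^{2x}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_2e^{-3x}\begin{bmatrix} -\tfrac{1}{4} \\ 1 \end{bmatrix}$$
This is called the general solution of the system.

(b) Note that in part (a), the vector in the first term is an eigenvector corresponding to the eigenvalue $\lambda_1 = 2$, and the vector in the second term is an eigenvector corresponding to the eigenvalue $\lambda_2 = -3$. This is a special case of the following general result:

Theorem. If the coefficient matrix $A$ of the system $\mathbf{y}' = A\mathbf{y}$ is diagonalizable, then the general solution of the system can be expressed as
$$\mathbf{y} = c_1e^{\lambda_1 x}\mathbf{x}_1 + c_2e^{\lambda_2 x}\mathbf{x}_2 + \cdots + c_ne^{\lambda_n x}\mathbf{x}_n$$
where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A$, and $\mathbf{x}_i$ is an eigenvector of $A$ corresponding to $\lambda_i$.

Prove this result by tracing through the four-step procedure preceding Example 2 with
$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \quad\text{and}\quad P = [\mathbf{x}_1 \mid \mathbf{x}_2 \mid \cdots \mid \mathbf{x}_n]$$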


11. Consider the system of differential equations $\mathbf{y}' = A\mathbf{y}$, where $A$ is a $2 \times 2$ matrix. For what values of $a_{11}, a_{12}, a_{21}, a_{22}$ do the component solutions $y_1(t)$, $y_2(t)$ tend to zero as $t \to \infty$? In particular, what must be true about the determinant and the trace of $A$ for this to happen?

12. Solve the nondiagonalizable system
$$\begin{aligned} y_1' &= y_1 + y_2 \\ y_2' &= y_2 \end{aligned}$$


True-False Exercises 
In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 


(a) Every system of differential equations $\mathbf{y}' = A\mathbf{y}$ has a solution.


Answer: 


False 


(b) If $\mathbf{x}' = A\mathbf{x}$ and $\mathbf{y}' = A\mathbf{y}$, then $\mathbf{x} = \mathbf{y}$.
Answer: 


False 


(c) If $\mathbf{x}' = A\mathbf{x}$ and $\mathbf{y}' = A\mathbf{y}$, then $(c\mathbf{x} + d\mathbf{y})' = A(c\mathbf{x} + d\mathbf{y})$ for all scalars $c$ and $d$.


Answer: 


True 


(d) If $A$ is a square matrix with distinct real eigenvalues, then it is possible to solve $\mathbf{x}' = A\mathbf{x}$ by diagonalization.
Answer: 


True 


(e) If $A$ and $P$ are similar matrices, then $\mathbf{y}' = A\mathbf{y}$ and $\mathbf{u}' = P\mathbf{u}$ have the same solutions.


Answer: 


False 
Chapter 5 Supplementary Exercises


1. (a) Show that if $0 < \theta < \pi$, then
$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
has no eigenvalues and consequently no eigenvectors.

(b) Give a geometric explanation of the result in part (a).
Answer:

(b) The transformation rotates vectors through the angle $\theta$; therefore, if $0 < \theta < \pi$, then no nonzero vector is transformed into a vector in the same or opposite direction.


2. Find the eigenvalues of
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ k^3 & -3k^2 & 3k \end{bmatrix}$$

3. (a) Show that if $D$ is a diagonal matrix with nonnegative entries on the main diagonal, then there is a matrix $S$ such that $S^2 = D$.

(b) Show that if $A$ is a diagonalizable matrix with nonnegative eigenvalues, then there is a matrix $S$ such that $S^2 = A$.

(c) Find a matrix $S$ such that $S^2 = A$, given that
$$A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 4 & 5 \\ 0 & 0 & 9 \end{bmatrix}$$
Answer:
(c)
$$\begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{bmatrix}$$

4. Prove: If $A$ is a square matrix, then $A$ and $A^T$ have the same characteristic polynomial.

5. Prove: If $A$ is a square matrix and $p(\lambda) = \det(\lambda I - A)$ is the characteristic polynomial of $A$, then the coefficient of $\lambda^{n-1}$ in $p(\lambda)$ is the negative of the trace of $A$.


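6. Prove: If $b \ne 0$, then
$$A = \begin{bmatrix} a & b \\ 0 & a \end{bmatrix}$$
is not diagonalizable.

7. In advanced linear algebra, one proves the Cayley-Hamilton Theorem, which states that a square matrix $A$ satisfies its characteristic equation; that is, if
$$c_0 + c_1\lambda + c_2\lambda^2 + \cdots + c_{n-1}\lambda^{n-1} + \lambda^n = 0$$
is the characteristic equation of $A$, then
$$c_0I + c_1A + c_2A^2 + \cdots + c_{n-1}A^{n-1} + A^n = 0$$
Verify this result for
$$\text{(a) } A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix} \qquad \text{(b) } A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \end{bmatrix}$$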

In Exercises 8—10, use the Cayley—Hamilton Theorem, stated in Exercise 7. 


8. (a) Use Exercise 18 of Section 5.1 to prove the Cayley-Hamilton Theorem for $2 \times 2$ matrices.
(b) Prove the Cayley-Hamilton Theorem for $n \times n$ diagonalizable matrices.

9. The Cayley-Hamilton Theorem provides a method for calculating powers of a matrix. For example, if $A$ is a $2 \times 2$ matrix with characteristic equation
$$c_0 + c_1\lambda + \lambda^2 = 0$$
then $c_0I + c_1A + A^2 = 0$, so
$$A^2 = -c_1A - c_0I$$
Multiplying through by $A$ yields $A^3 = -c_1A^2 - c_0A$, which expresses $A^3$ in terms of $A^2$ and $A$, and multiplying through by $A^2$ yields $A^4 = -c_1A^3 - c_0A^2$, which expresses $A^4$ in terms of $A^3$ and $A^2$. Continuing in this way, we can calculate successive powers of $A$ by expressing them in terms of lower powers. Use this procedure to calculate $A^2$, $A^3$, $A^4$, and $A^5$ for
$$A = \begin{bmatrix} 3 & 6 \\ 1 & 2 \end{bmatrix}$$
Answer:

$$A^2 = \begin{bmatrix} 15 & 30 \\ 5 & 10 \end{bmatrix}, \quad A^3 = \begin{bmatrix} 75 & 150 \\ 25 & 50 \end{bmatrix}, \quad A^4 = \begin{bmatrix} 375 & 750 \\ 125 & 250 \end{bmatrix}, \quad A^5 = \begin{bmatrix} 1875 & 3750 \\ 625 & 1250 \end{bmatrix}$$

10. Use the method of the preceding exercise to calculate $A^3$ and $A^4$ for
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & -3 & 3 \end{bmatrix}$$


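11. Find the eigenvalues of the matrix
$$A = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \\ c_1 & c_2 & \cdots & c_n \\ \vdots & \vdots & & \vdots \\ c_1 & c_2 & \cdots & c_n \end{bmatrix}$$

Answer:

$0$, $\mathrm{tr}(A)$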


12. (a) It was shown in Exercise 17 of Section 5.1 that if $A$ is an $n \times n$ matrix, then the coefficient of $\lambda^n$ in the characteristic polynomial of $A$ is 1. (A polynomial with this property is called monic.) Show that the matrix
$$\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -c_0 \\ 1 & 0 & 0 & \cdots & 0 & -c_1 \\ 0 & 1 & 0 & \cdots & 0 & -c_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -c_{n-1} \end{bmatrix}$$
has characteristic polynomial
$$p(\lambda) = c_0 + c_1\lambda + \cdots + c_{n-1}\lambda^{n-1} + \lambda^n$$
This shows that every monic polynomial is the characteristic polynomial of some matrix. The matrix in this example is called the companion matrix of $p(\lambda)$. [Hint: Evaluate all determinants in the problem by adding a multiple of the second row to the first to introduce a zero at the top of the first column, and then expanding by cofactors along the first column.]

(b) Find a matrix with characteristic polynomial
$$p(\lambda) = 1 - 2\lambda + \lambda^2 + 3\lambda^3 + \lambda^4$$


13. A square matrix $A$ is called nilpotent if $A^n = 0$ for some positive integer $n$. What can you say about the eigenvalues of a nilpotent matrix?


Answer: 


They are all 0. 
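14. Prove: If $A$ is an $n \times n$ matrix and $n$ is odd, then $A$ has at least one real eigenvalue.

15. Find a $3 \times 3$ matrix $A$ that has eigenvalues $\lambda = 0$, $1$, and $-1$ with corresponding eigenvectors
$$\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
respectively.
Answer:
$$A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & -\tfrac{1}{2} & -\tfrac{1}{2} \\ 1 & -\tfrac{1}{2} & -\tfrac{1}{2} \end{bmatrix}$$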


16. Suppose that a $4 \times 4$ matrix $A$ has eigenvalues $\lambda_1 = 1$, $\lambda_2 = -2$, $\lambda_3 = 3$, and $\lambda_4 = -3$.
(a) Use the method of Exercise 16 of Section 5.1 to find $\det(A)$.
(b) Use Exercise 5 above to find $\mathrm{tr}(A)$.

17. Let $A$ be a square matrix such that $A^3 = A$. What can you say about the eigenvalues of $A$?

Answer:

They are all 0, 1, or $-1$.


18. (a) Solve the system
$$\begin{aligned} y_1' &= y_1 + 3y_2 \\ y_2' &= 2y_1 + 4y_2 \end{aligned}$$

(b) Find the solution satisfying the initial conditions $y_1(0) = 5$ and $y_2(0) = 6$.
CHAPTER 6


Inner Product Spaces 


CHAPTER CONTENTS 


6.1. Inner Products 

6.2. Angle and Orthogonality in Inner Product Spaces 
6.3. Gram-Schmidt Process; QR-Decomposition

6.4. Best Approximation; Least Squares 

6.5. Least Squares Fitting to Data 


6.6. Function Approximation; Fourier Series 


INTRODUCTION 


In Chapter 3 we defined the dot product of vectors in $R^n$, and we used that concept to define notions of length, angle, distance, and orthogonality. In this chapter we will generalize those ideas so they are applicable in any vector space, not just $R^n$. We will also discuss various applications of these ideas.
6.1 Inner Products


In this section we will use the most important properties of the dot product on $R^n$ as axioms, which, if satisfied by the vectors in a vector space $V$, will enable us to extend the notions of length, distance, angle, and perpendicularity to general vector spaces.


General Inner Products 


In Definition 4 of Section 3.2 we defined the dot product of two vectors in $R^n$, and in Theorem 3.2.2 we listed four fundamental properties of such products. Our first goal in this section is to extend the notion of a dot product to general real vector spaces by using those four properties as axioms. We make the following definition.


Note that Definition 1 applies only to real vector 
spaces. A definition of inner products on complex 
vector spaces is given in the exercises. Since we will 
have little need for complex vector spaces from this 
point on, you can assume that all vector spaces under 
discussion are real, even though some of the theorems 
are also valid in complex vector spaces. 


DEFINITION 1 


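An inner product on a real vector space $V$ is a function that associates a real number $\langle \mathbf{u}, \mathbf{v} \rangle$ with each pair of vectors in $V$ in such a way that the following axioms are satisfied for all vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $V$ and all scalars $k$.

1. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$ [Symmetry axiom]

2. $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$ [Additivity axiom]

3. $\langle k\mathbf{u}, \mathbf{v} \rangle = k\langle \mathbf{u}, \mathbf{v} \rangle$ [Homogeneity axiom]

4. $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$ and $\langle \mathbf{v}, \mathbf{v} \rangle = 0$ if and only if $\mathbf{v} = \mathbf{0}$ [Positivity axiom]

A real vector space with an inner product is called a real inner product space.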


Because the axioms for a real inner product space are based on properties of the dot product, these inner product space axioms will be satisfied automatically if we define the inner product of two vectors $\mathbf{u}$ and $\mathbf{v}$ in $R^n$ to be
$$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n$$
This inner product is commonly called the Euclidean inner product (or the standard inner product) on $R^n$ to distinguish it from other possible inner products that might be defined on $R^n$. We call $R^n$ with the Euclidean inner product Euclidean n-space.


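Inner products can be used to define notions of norm and distance in a general inner product space just as we did with dot products in $R^n$. Recall from Formulas 11 and 19 of Section 3.2 that if $\mathbf{u}$ and $\mathbf{v}$ are vectors in Euclidean n-space, then norm and distance can be expressed in terms of the dot product as
$$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}} \quad\text{and}\quad d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{(\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v})}$$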


Motivated by these formulas we make the following definition. 


DEFINITION 2 


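If $V$ is a real inner product space, then the norm (or length) of a vector $\mathbf{v}$ in $V$ is denoted by $\|\mathbf{v}\|$ and is defined by
$$\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$$
and the distance between two vectors is denoted by $d(\mathbf{u}, \mathbf{v})$ and is defined by
$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{\langle \mathbf{u} - \mathbf{v}, \mathbf{u} - \mathbf{v} \rangle}$$

A vector of norm 1 is called a unit vector.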


The following theorem, which we state without proof, shows that norms and distances in real inner product spaces have many 
of the properties that you might expect. 


THEOREM 6.1.1 


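If $\mathbf{u}$ and $\mathbf{v}$ are vectors in a real inner product space $V$, and if $k$ is a scalar, then:
(a) $\|\mathbf{v}\| \ge 0$ with equality if and only if $\mathbf{v} = \mathbf{0}$.

(b) $\|k\mathbf{v}\| = |k|\,\|\mathbf{v}\|$.

(c) $d(\mathbf{u}, \mathbf{v}) = d(\mathbf{v}, \mathbf{u})$.

(d) $d(\mathbf{u}, \mathbf{v}) \ge 0$ with equality if and only if $\mathbf{u} = \mathbf{v}$.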


Although the Euclidean inner product is the most important inner product on $R^n$, there are various applications in which it is desirable to modify it by weighting each term differently. More precisely, if
$$w_1, w_2, \ldots, w_n$$
are positive real numbers, which we will call weights, and if $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ and $\mathbf{v} = (v_1, v_2, \ldots, v_n)$ are vectors in $R^n$, then it can be shown that the formula
$$\langle \mathbf{u}, \mathbf{v} \rangle = w_1u_1v_1 + w_2u_2v_2 + \cdots + w_nu_nv_n \tag{1}$$
defines an inner product on $R^n$ that we call the weighted Euclidean inner product with weights $w_1, w_2, \ldots, w_n$.

Note that the standard Euclidean inner product is the special case of the weighted Euclidean inner product in which all the weights are 1.


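EXAMPLE 1 Weighted Euclidean Inner Product
Let $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ be vectors in $R^2$. Verify that the weighted Euclidean inner product
$$\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1v_1 + 2u_2v_2 \tag{2}$$
satisfies the four inner product axioms.

Solution Axiom 1: Interchanging $\mathbf{u}$ and $\mathbf{v}$ in Formula 2 does not change the sum on the right side, so $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$.

Axiom 2: If $\mathbf{w} = (w_1, w_2)$, then
$$\begin{aligned}
\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle &= 3(u_1 + v_1)w_1 + 2(u_2 + v_2)w_2 \\
&= 3(u_1w_1 + v_1w_1) + 2(u_2w_2 + v_2w_2) \\
&= (3u_1w_1 + 2u_2w_2) + (3v_1w_1 + 2v_2w_2) \\
&= \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle
\end{aligned}$$

Axiom 3:
$$\begin{aligned}
\langle k\mathbf{u}, \mathbf{v} \rangle &= 3(ku_1)v_1 + 2(ku_2)v_2 \\
&= k(3u_1v_1 + 2u_2v_2) \\
&= k\langle \mathbf{u}, \mathbf{v} \rangle
\end{aligned}$$

Axiom 4: $\langle \mathbf{v}, \mathbf{v} \rangle = 3(v_1v_1) + 2(v_2v_2) = 3v_1^2 + 2v_2^2 \ge 0$ with equality if and only if $v_1 = v_2 = 0$; that is, if and only if $\mathbf{v} = \mathbf{0}$.

In Example 1, we are using subscripted w's to denote the components of the vector $\mathbf{w}$, not the weights. The weights are the numbers 3 and 2 in Formula 2.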


An Application of Weighted Euclidean Inner Products 


To illustrate one way in which a weighted Euclidean inner product can arise, suppose that some physical experiment has $n$ possible numerical outcomes
$$x_1, x_2, \ldots, x_n$$
and that a series of $m$ repetitions of the experiment yields these values with various frequencies. Specifically, suppose that $x_1$ occurs $f_1$ times, $x_2$ occurs $f_2$ times, and so forth. Since there are a total of $m$ repetitions of the experiment, it follows that
$$f_1 + f_2 + \cdots + f_n = m$$
Thus, the arithmetic average of the observed numerical values (denoted by $\bar{x}$) is
$$\bar{x} = \frac{f_1x_1 + f_2x_2 + \cdots + f_nx_n}{f_1 + f_2 + \cdots + f_n} = \frac{1}{m}(f_1x_1 + f_2x_2 + \cdots + f_nx_n) \tag{3}$$
If we let
$$\begin{aligned} \mathbf{f} &= (f_1, f_2, \ldots, f_n) \\ \mathbf{x} &= (x_1, x_2, \ldots, x_n) \\ w_1 &= w_2 = \cdots = w_n = 1/m \end{aligned}$$
then (3) can be expressed as the weighted Euclidean inner product
$$\bar{x} = \langle \mathbf{f}, \mathbf{x} \rangle = w_1f_1x_1 + w_2f_2x_2 + \cdots + w_nf_nx_n$$


EXAMPLE 2 Using a Weighted Euclidean Inner Product

It is important to keep in mind that norm and distance depend on the inner product being used. If the inner product is changed, then the norms and distances between vectors also change. For example, for the vectors $\mathbf{u} = (1, 0)$ and $\mathbf{v} = (0, 1)$ in $R^2$ with the Euclidean inner product we have
$$\|\mathbf{u}\| = \sqrt{1^2 + 0^2} = 1$$
and
$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \|(1, -1)\| = \sqrt{1^2 + (-1)^2} = \sqrt{2}$$
but if we change to the weighted Euclidean inner product
$$\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1v_1 + 2u_2v_2$$
we have
$$\|\mathbf{u}\| = \langle \mathbf{u}, \mathbf{u} \rangle^{1/2} = [3(1)(1) + 2(0)(0)]^{1/2} = \sqrt{3}$$
and
$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \langle (1, -1), (1, -1) \rangle^{1/2} = [3(1)(1) + 2(-1)(-1)]^{1/2} = \sqrt{5}$$
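
The dependence of norm and distance on the chosen inner product can be reproduced with a short computation. This is a minimal sketch; the function names `weighted_inner`, `norm`, and `distance` are illustrative choices, and the Euclidean case is obtained by setting every weight to 1.

```python
import math

def weighted_inner(u, v, weights):
    """Weighted Euclidean inner product: sum of w_i * u_i * v_i."""
    return sum(w * a * b for w, a, b in zip(weights, u, v))

def norm(u, weights):
    """Norm induced by the weighted inner product."""
    return math.sqrt(weighted_inner(u, u, weights))

def distance(u, v, weights):
    """Distance d(u, v) = ||u - v|| in the weighted inner product."""
    diff = [a - b for a, b in zip(u, v)]
    return norm(diff, weights)

u, v = (1, 0), (0, 1)
euclid = (1, 1)      # all weights 1 gives the Euclidean inner product
weighted = (3, 2)    # weights from Formula 2

euclid_norm = norm(u, euclid)            # 1
euclid_dist = distance(u, v, euclid)     # sqrt(2)
weighted_norm = norm(u, weighted)        # sqrt(3)
weighted_dist = distance(u, v, weighted) # sqrt(5)
```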


Unit Circles and Spheres in Inner Product Spaces 


If $V$ is an inner product space, then the set of points in $V$ that satisfy
$$\|\mathbf{u}\| = 1$$
is called the unit sphere or sometimes the unit circle in $V$.

EXAMPLE 3 Unusual Unit Circles in $R^2$

(a) Sketch the unit circle in an xy-coordinate system in $R^2$ using the Euclidean inner product $\langle \mathbf{u}, \mathbf{v} \rangle = u_1v_1 + u_2v_2$.
(b) Sketch the unit circle in an xy-coordinate system in $R^2$ using the weighted Euclidean inner product $\langle \mathbf{u}, \mathbf{v} \rangle = \tfrac{1}{9}u_1v_1 + \tfrac{1}{4}u_2v_2$.

Solution
(a) If $\mathbf{u} = (x, y)$, then $\|\mathbf{u}\| = \langle \mathbf{u}, \mathbf{u} \rangle^{1/2} = \sqrt{x^2 + y^2}$, so the equation of the unit circle is $\sqrt{x^2 + y^2} = 1$, or, on squaring both sides,
$$x^2 + y^2 = 1$$
As expected, the graph of this equation is a circle of radius 1 centered at the origin (Figure 6.1.1a).
(b) If $\mathbf{u} = (x, y)$, then $\|\mathbf{u}\| = \langle \mathbf{u}, \mathbf{u} \rangle^{1/2} = \sqrt{\tfrac{1}{9}x^2 + \tfrac{1}{4}y^2}$, so the equation of the unit circle is $\sqrt{\tfrac{1}{9}x^2 + \tfrac{1}{4}y^2} = 1$, or, on squaring both sides,
$$\frac{x^2}{9} + \frac{y^2}{4} = 1$$
The graph of this equation is the ellipse shown in Figure 6.1.1b.

Figure 6.1.1 (a) The unit circle using the standard Euclidean inner product. (b) The unit circle using a weighted Euclidean inner product.


Remark It may seem odd that the “unit circle" in the second part of the last example turned out to have an elliptical shape. 
This will make more sense if you think of circles and spheres in general vector spaces algebraically (||u|| = 1) rather than 
geometrically. The change in geometry occurs because the norm, not being Euclidean, has the effect of distorting the space that 
we are used to seeing through “Euclidean eyes.” 


Inner Products Generated by Matrices 


The Euclidean inner product and the weighted Euclidean inner products are special cases of a general class of inner products on $R^n$ called matrix inner products. To define this class of inner products, let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $R^n$ that are expressed in column form, and let $A$ be an invertible $n \times n$ matrix. It can be shown (Exercise 31) that if $\mathbf{u} \cdot \mathbf{v}$ is the Euclidean inner product on $R^n$, then the formula
$$\langle \mathbf{u}, \mathbf{v} \rangle = A\mathbf{u} \cdot A\mathbf{v} \tag{4}$$
also defines an inner product; it is called the inner product on $R^n$ generated by $A$.

Recall from Table 1 of Section 3.2 that if $\mathbf{u}$ and $\mathbf{v}$ are in column form, then $\mathbf{u} \cdot \mathbf{v}$ can be written as $\mathbf{v}^T\mathbf{u}$, from which it follows that (4) can be expressed as
$$\langle \mathbf{u}, \mathbf{v} \rangle = (A\mathbf{v})^TA\mathbf{u}$$
or, equivalently, as
$$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{v}^TA^TA\mathbf{u} \tag{5}$$


EXAMPLE 4 Matrices Generating Weighted Euclidean Inner Products

The standard Euclidean and weighted Euclidean inner products are examples of matrix inner products. The standard Euclidean inner product on $R^n$ is generated by the $n \times n$ identity matrix, since setting $A = I$ in Formula 4 yields
$$\langle \mathbf{u}, \mathbf{v} \rangle = I\mathbf{u} \cdot I\mathbf{v} = \mathbf{u} \cdot \mathbf{v}$$
and the weighted Euclidean inner product
$$\langle \mathbf{u}, \mathbf{v} \rangle = w_1u_1v_1 + w_2u_2v_2 + \cdots + w_nu_nv_n \tag{6}$$
is generated by the matrix
$$A = \begin{bmatrix} \sqrt{w_1} & 0 & 0 & \cdots & 0 \\ 0 & \sqrt{w_2} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \sqrt{w_n} \end{bmatrix} \tag{7}$$
This can be seen by first observing that $A^TA$ is the $n \times n$ diagonal matrix whose diagonal entries are the weights $w_1, w_2, \ldots, w_n$ and then observing that (5) simplifies to (6) when $A$ is the matrix in Formula 7.


EXAMPLE 5 Example 1 Revisited

The weighted Euclidean inner product $\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1v_1 + 2u_2v_2$ discussed in Example 1 is the inner product on $R^2$ generated by
$$A = \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{2} \end{bmatrix}$$

Every diagonal matrix with positive diagonal entries generates a weighted inner product. Why?
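
The relationship between Formula 4 and the weighted inner product of Example 1 can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names `mat_vec`, `dot`, and `matrix_inner` and the two test vectors are arbitrary choices, not taken from the text.

```python
import math

def mat_vec(A, x):
    """Multiply the matrix A (list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))

def matrix_inner(A, u, v):
    """Inner product generated by A: <u, v> = Au . Av (Formula 4)."""
    return dot(mat_vec(A, u), mat_vec(A, v))

# Generating matrix from Example 5.
A = [[math.sqrt(3), 0.0],
     [0.0, math.sqrt(2)]]

u, v = [1.0, 2.0], [3.0, -1.0]   # arbitrary test vectors (assumption)
generated = matrix_inner(A, u, v)
weighted = 3 * u[0] * v[0] + 2 * u[1] * v[1]   # Formula 2 of Example 1
```

The two values agree up to floating-point rounding, since $A^TA$ is the diagonal matrix with entries 3 and 2.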


Other Examples of Inner Products 


So far, we have only considered examples of inner products on $R^n$. We will now consider examples of inner products on some of the other kinds of vector spaces that we discussed earlier.


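EXAMPLE 6 An Inner Product on $M_{nn}$

If $U$ and $V$ are $n \times n$ matrices, then the formula
$$\langle U, V \rangle = \mathrm{tr}(U^TV) \tag{8}$$
defines an inner product on the vector space $M_{nn}$ (see Definition 8 of Section 1.3 for a definition of trace). This can be proved by confirming that the four inner product space axioms are satisfied, but you can visualize why this is so by computing (8) for the $2 \times 2$ matrices
$$U = \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}$$
This yields
$$\langle U, V \rangle = \mathrm{tr}(U^TV) = u_1v_1 + u_2v_2 + u_3v_3 + u_4v_4$$
which is just the dot product of the corresponding entries in the two matrices. For example, if
$$U = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} -1 & 0 \\ 3 & 2 \end{bmatrix}$$
then
$$\langle U, V \rangle = 1(-1) + 2(0) + 3(3) + 4(2) = 16$$

The norm of a matrix $U$ relative to this inner product is
$$\|U\| = \langle U, U \rangle^{1/2} = \sqrt{u_1^2 + u_2^2 + u_3^2 + u_4^2}$$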
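
The trace formula of Example 6 can be evaluated directly for the two matrices used there. This is a small sketch with illustrative helper names (`transpose`, `mat_mul`, `trace`, `trace_inner`); it also computes the entrywise sum of products so the two expressions can be compared.

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    """Matrix product AB for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def trace_inner(U, V):
    """Inner product <U, V> = tr(U^T V) of Formula 8."""
    return trace(mat_mul(transpose(U), V))

U = [[1, 2], [3, 4]]
V = [[-1, 0], [3, 2]]

value = trace_inner(U, V)
# Dot product of corresponding entries, for comparison.
entrywise = sum(u * v for ru, rv in zip(U, V) for u, v in zip(ru, rv))
norm_U = trace_inner(U, U) ** 0.5
```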


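EXAMPLE 7 The Standard Inner Product on $P_n$

If
$$\mathbf{p} = a_0 + a_1x + \cdots + a_nx^n \quad\text{and}\quad \mathbf{q} = b_0 + b_1x + \cdots + b_nx^n$$
are polynomials in $P_n$, then the following formula defines an inner product on $P_n$ (verify) that we will call the standard inner product on this space:
$$\langle \mathbf{p}, \mathbf{q} \rangle = a_0b_0 + a_1b_1 + \cdots + a_nb_n \tag{9}$$

The norm of a polynomial $\mathbf{p}$ relative to this inner product is
$$\|\mathbf{p}\| = \sqrt{\langle \mathbf{p}, \mathbf{p} \rangle} = \sqrt{a_0^2 + a_1^2 + \cdots + a_n^2}$$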


EXAMPLE 8 The Evaluation Inner Product on $P_n$

If
$$\mathbf{p} = p(x) = a_0 + a_1x + \cdots + a_nx^n \quad\text{and}\quad \mathbf{q} = q(x) = b_0 + b_1x + \cdots + b_nx^n$$
are polynomials in $P_n$, and if $x_0, x_1, \ldots, x_n$ are distinct real numbers (called sample points), then the formula
$$\langle \mathbf{p}, \mathbf{q} \rangle = p(x_0)q(x_0) + p(x_1)q(x_1) + \cdots + p(x_n)q(x_n) \tag{10}$$
defines an inner product on $P_n$ called the evaluation inner product at $x_0, x_1, \ldots, x_n$. Algebraically, this can be viewed as the dot product in $R^{n+1}$ of the $(n+1)$-tuples
$$(p(x_0), p(x_1), \ldots, p(x_n)) \quad\text{and}\quad (q(x_0), q(x_1), \ldots, q(x_n))$$
and hence the first three inner product axioms follow from properties of the dot product. The fourth inner product axiom follows from the fact that
$$\langle \mathbf{p}, \mathbf{p} \rangle = [p(x_0)]^2 + [p(x_1)]^2 + \cdots + [p(x_n)]^2 \ge 0$$
with equality holding if and only if
$$p(x_0) = p(x_1) = \cdots = p(x_n) = 0$$
But a nonzero polynomial of degree $n$ or less can have at most $n$ distinct roots, so it must be that $\mathbf{p} = \mathbf{0}$, which proves that the fourth inner product axiom holds.

The norm of a polynomial $\mathbf{p}$ relative to the evaluation inner product is
$$\|\mathbf{p}\| = \sqrt{\langle \mathbf{p}, \mathbf{p} \rangle} = \sqrt{[p(x_0)]^2 + [p(x_1)]^2 + \cdots + [p(x_n)]^2} \tag{11}$$


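EXAMPLE 9 Working with the Evaluation Inner Product

Let $P_2$ have the evaluation inner product at the points
$$x_0 = -2, \quad x_1 = 0, \quad\text{and}\quad x_2 = 2$$
Compute $\langle \mathbf{p}, \mathbf{q} \rangle$ and $\|\mathbf{p}\|$ for the polynomials $\mathbf{p} = p(x) = x^2$ and $\mathbf{q} = q(x) = 1 + x$.

Solution It follows from (10) and (11) that
$$\langle \mathbf{p}, \mathbf{q} \rangle = p(-2)q(-2) + p(0)q(0) + p(2)q(2) = (4)(-1) + (0)(1) + (4)(3) = 8$$
$$\|\mathbf{p}\| = \sqrt{[p(x_0)]^2 + [p(x_1)]^2 + [p(x_2)]^2} = \sqrt{[p(-2)]^2 + [p(0)]^2 + [p(2)]^2} = \sqrt{4^2 + 0^2 + 4^2} = \sqrt{32} = 4\sqrt{2}$$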
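
The evaluation inner product reduces to a sum over the sample points, so Example 9 is easy to reproduce. In this sketch the polynomials are represented as ordinary Python functions, and the names `evaluation_inner` and `evaluation_norm` are illustrative choices.

```python
import math

def evaluation_inner(p, q, points):
    """Evaluation inner product: sum of p(x_i) * q(x_i) over the sample points."""
    return sum(p(x) * q(x) for x in points)

def evaluation_norm(p, points):
    """Norm induced by the evaluation inner product (Formula 11)."""
    return math.sqrt(evaluation_inner(p, p, points))

points = [-2, 0, 2]          # sample points x0, x1, x2 from Example 9
p = lambda x: x ** 2         # p(x) = x^2
q = lambda x: 1 + x          # q(x) = 1 + x

inner_pq = evaluation_inner(p, q, points)
norm_p = evaluation_norm(p, points)
```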


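CALCULUS REQUIRED
EXAMPLE 10 An Inner Product on $C[a, b]$
Let $\mathbf{f} = f(x)$ and $\mathbf{g} = g(x)$ be two functions in $C[a, b]$ and define
$$\langle \mathbf{f}, \mathbf{g} \rangle = \int_a^b f(x)g(x)\,dx \tag{12}$$

We will show that this formula defines an inner product on $C[a, b]$ by verifying the four inner product axioms for functions $\mathbf{f} = f(x)$, $\mathbf{g} = g(x)$, and $\mathbf{h} = h(x)$ in $C[a, b]$:

1.
$$\langle \mathbf{f}, \mathbf{g} \rangle = \int_a^b f(x)g(x)\,dx = \int_a^b g(x)f(x)\,dx = \langle \mathbf{g}, \mathbf{f} \rangle$$
which proves that Axiom 1 holds.

2.
$$\begin{aligned}
\langle \mathbf{f} + \mathbf{g}, \mathbf{h} \rangle &= \int_a^b (f(x) + g(x))h(x)\,dx \\
&= \int_a^b f(x)h(x)\,dx + \int_a^b g(x)h(x)\,dx \\
&= \langle \mathbf{f}, \mathbf{h} \rangle + \langle \mathbf{g}, \mathbf{h} \rangle
\end{aligned}$$
which proves that Axiom 2 holds.

3.
$$\langle k\mathbf{f}, \mathbf{g} \rangle = \int_a^b kf(x)g(x)\,dx = k\int_a^b f(x)g(x)\,dx = k\langle \mathbf{f}, \mathbf{g} \rangle$$
which proves that Axiom 3 holds.

4. If $\mathbf{f} = f(x)$ is any function in $C[a, b]$, then
$$\langle \mathbf{f}, \mathbf{f} \rangle = \int_a^b f^2(x)\,dx \ge 0 \tag{13}$$
since $f^2(x) \ge 0$ for all $x$ in the interval $[a, b]$. Moreover, because $f$ is continuous on $[a, b]$, the equality holds in Formula 13 if and only if the function $f$ is identically zero on $[a, b]$, that is, if and only if $\mathbf{f} = \mathbf{0}$; and this proves that Axiom 4 holds.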


CALCULUS REQUIRED

EXAMPLE 11 Norm of a Vector in $C[a, b]$

If $C[a, b]$ has the inner product that was defined in Example 10, then the norm of a function $\mathbf{f} = f(x)$ relative to this inner product is
$$\|\mathbf{f}\| = \langle \mathbf{f}, \mathbf{f} \rangle^{1/2} = \sqrt{\int_a^b f^2(x)\,dx} \tag{14}$$
and the unit sphere in this space consists of all functions $\mathbf{f}$ in $C[a, b]$ that satisfy the equation
$$\int_a^b f^2(x)\,dx = 1$$
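
Formulas 12 and 14 can be approximated numerically when the integrals are not convenient to evaluate by hand. The following is a sketch under stated assumptions: it uses the composite Simpson's rule as the quadrature method, and the functions $f(x) = x$, $g(x) = x^2$ on $[0, 1]$ are illustrative choices that do not come from the text. The helper names are hypothetical.

```python
import math

def simpson(func, a, b, n=100):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3

def integral_inner(f, g, a, b):
    """Approximate <f, g> = integral of f(x) g(x) over [a, b] (Formula 12)."""
    return simpson(lambda x: f(x) * g(x), a, b)

def integral_norm(f, a, b):
    """Approximate ||f|| = sqrt(<f, f>) (Formula 14)."""
    return math.sqrt(integral_inner(f, f, a, b))

f = lambda x: x          # illustrative choice
g = lambda x: x ** 2     # illustrative choice

inner_fg = integral_inner(f, g, 0.0, 1.0)   # integral of x^3 on [0, 1] is 1/4
norm_f = integral_norm(f, 0.0, 1.0)         # sqrt of integral of x^2, i.e. 1/sqrt(3)
```

Simpson's rule is exact for polynomials of degree three or less, which is why these particular integrands are reproduced to rounding error; for general continuous functions the result is only an approximation.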


Remark Note that the vector space $P_n$ is a subspace of $C[a, b]$ because polynomials are continuous functions. Thus, Formula 12 defines an inner product on $P_n$.

Remark Recall from calculus that the arc length of a curve $y = f(x)$ over an interval $[a, b]$ is given by the formula
$$L = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx \tag{15}$$
Do not confuse this concept of arc length with $\|\mathbf{f}\|$, which is the length (norm) of $\mathbf{f}$ when $\mathbf{f}$ is viewed as a vector in $C[a, b]$. Formulas 14 and 15 are quite different.


Algebraic Properties of Inner Products 


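The following theorem lists some of the algebraic properties of inner products that follow from the inner product axioms. This result is a generalization of Theorem 3.2.3, which applied only to the dot product on $R^n$.

THEOREM 6.1.2

If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in a real inner product space $V$, and if $k$ is a scalar, then
(a) $\langle \mathbf{0}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle = 0$
(b) $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$
(c) $\langle \mathbf{u}, \mathbf{v} - \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle - \langle \mathbf{u}, \mathbf{w} \rangle$
(d) $\langle \mathbf{u} - \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle - \langle \mathbf{v}, \mathbf{w} \rangle$
(e) $k\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{u}, k\mathbf{v} \rangle$

Proof We will prove part (b) and leave the proofs of the remaining parts as exercises.
$$\begin{aligned}
\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle &= \langle \mathbf{v} + \mathbf{w}, \mathbf{u} \rangle && \text{[By symmetry]} \\
&= \langle \mathbf{v}, \mathbf{u} \rangle + \langle \mathbf{w}, \mathbf{u} \rangle && \text{[By additivity]} \\
&= \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle && \text{[By symmetry]}
\end{aligned}$$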


The following example illustrates how Theorem 6.1.2 and the defining properties of inner products can be used to perform 
algebraic computations with inner products. As you read through the example, you will find it instructive to justify the steps. 


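EXAMPLE 12 Calculating with Inner Products

$\langle u - 2v, 3u + 4v\rangle = \langle u, 3u + 4v\rangle - \langle 2v, 3u + 4v\rangle$
$= \langle u, 3u\rangle + \langle u, 4v\rangle - \langle 2v, 3u\rangle - \langle 2v, 4v\rangle$
$= 3\langle u, u\rangle + 4\langle u, v\rangle - 6\langle v, u\rangle - 8\langle v, v\rangle$
$= 3\|u\|^2 + 4\langle u, v\rangle - 6\langle u, v\rangle - 8\|v\|^2$
$= 3\|u\|^2 - 2\langle u, v\rangle - 8\|v\|^2$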
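The expansion in Example 12 can be spot-checked numerically. The sketch below assumes the Euclidean inner product on $R^3$ and two arbitrarily chosen sample vectors; the vectors and the helper names `inner`, `lhs`, and `rhs` are illustrative assumptions, not part of the example.

```python
# Spot-check of Example 12 with the Euclidean inner product on R^3.
# The vectors u and v are arbitrary sample data chosen for illustration.

def inner(x, y):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

u = (1, -2, 3)
v = (4, 0, -1)

# Left side: <u - 2v, 3u + 4v>
first = tuple(a - 2 * b for a, b in zip(u, v))
second = tuple(3 * a + 4 * b for a, b in zip(u, v))
lhs = inner(first, second)

# Right side: 3||u||^2 - 2<u, v> - 8||v||^2
rhs = 3 * inner(u, u) - 2 * inner(u, v) - 8 * inner(v, v)

print(lhs, rhs)  # both sides agree
```

Any other inner product that satisfies the four axioms could be substituted for `inner`, since the derivation uses only the axioms and Theorem 6.1.2.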


Concept Review 


Inner product axioms 


Euclidean inner product 


Euclidean n-space 


Weighted Euclidean inner product 


Unit circle (sphere) 


Matrix inner product 


Norm in an inner product space 


Distance between two vectors in an inner product space 


Examples of inner products 


Properties of inner products 


Skills 
* Compute the inner product of two vectors. 
* Find the norm of a vector. 


* Find the distance between two vectors. 


* Show that a given formula defines an inner product. 


* Show that a given formula does not define an inner product by demonstrating that at least one of the inner product 
space axioms fails. 


Exercise Set 6.1 


1. Let $\langle u, v\rangle$ be the Euclidean inner product on $R^2$, and let u = (1, 1), v = (3, 2), w = (0, -1), and k = 3. Compute the following.
(a) $\langle u, v\rangle$
(b) $\langle kv, w\rangle$
(c) $\langle u + v, w\rangle$
(d) $\|v\|$
(e) $d(u, v)$
(f) $\|u - kv\|$

Answer:

(a) 5
(b) -6
(c) -3
(d) $\sqrt{13}$
(e) $\sqrt{5}$
(f) $\sqrt{89}$


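2. Repeat Exercise 1 for the weighted Euclidean inner product $\langle u, v\rangle = 2u_1v_1 + 3u_2v_2$.

3. Let $\langle u, v\rangle$ be the Euclidean inner product on $R^2$, and let u = (3, -2), v = (4, 5), w = (-1, 6), and k = -4. Verify the following.
(a) $\langle u, v\rangle = \langle v, u\rangle$
(b) $\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle$
(c) $\langle u, v + w\rangle = \langle u, v\rangle + \langle u, w\rangle$
(d) $\langle ku, v\rangle = k\langle u, v\rangle = \langle u, kv\rangle$
(e) $\langle \mathbf{0}, v\rangle = \langle v, \mathbf{0}\rangle = 0$

Answer:

(a) 2
(b) 11
(c) -13
(d) -8
(e) 0

4. Repeat Exercise 3 for the weighted Euclidean inner product $\langle u, v\rangle = 4u_1v_1 + 5u_2v_2$.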


5. Let $\langle u, v\rangle$ be the inner product on $R^2$ generated by $\begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$, and let u = (2, 1), v = (-1, 1), w = (0, -1). Compute the following.
(a) $\langle u, v\rangle$
(b) $\langle v, w\rangle$
(c) $\langle u + v, w\rangle$
(d) $\|v\|$
(e) $d(v, w)$
(f) $\|v - w\|^2$

Answer:

(a) -5
(b) 1
(c) -7
(d) 1
(e) 1
(f) 1
; А i 2 1 0 
Repeat Exercise 5 for the inner product on g^ generated by "Ee 


7. Compute (u, v) using the inner product in Example 6. 


(a. |3 -2 _|=-1 3 
а= |; sb Ыы 1 | 


€ [12]. [46 
a= 3 Кыр 1 


Answer: 


(a) 3 
(b) 56 


8. Compute (P. q) using the inner product in Example 7. 
(a) p= -24x4 3х2, q—4 — 7x? 
(b) p= —5-+Е2х +х2,4=3 { ^re А52 


9. (a) Use Formula 4 to show that $\langle u, v\rangle = 9u_1v_1 + 4u_2v_2$ is the inner product on $R^2$ generated by
$A = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}$
(b) Use the inner product in part (a) to compute $\langle u, v\rangle$ if u = (-3, 2) and v = (1, 7).
Answer:
(b) 29


10. (a) Use Formula 4 to show that 
(u, v) = 50у] — йу — u5v1 + 1025v5 


x 


is the inner product on 22 generated by 


1 


- 


12. 


13. 


14. 


15. 


16. 


17. 


(b) Use the inner product in part (a) to compute (u, v) ifu = (0, = 3) and v = (6, 2). 


. Let u = (u1, из) and v = (у, уз). In each part, the given expression is an inner product on R?. Find a matrix that 


generates it. 
(a) (u, v) = 3ujv1 + 54272 
(b) (u, v) = 40у + билу? 


Answer: 


(a) | {3 0 
0 ¥5 
(b) |2 9 
Let P^ have the inner product in Example 7. In each part, find |[p ||. 
(а) p= —2--3x + 2х2 
(b р=4- 3х2 


Let M 4; have the inner product in Example 6. In each part, find || А||. 


(a) 4, |-2 5 
ET 


() , [0 0 
ЧІ 


Answer: 
(a) 74 
(b) 0 
Let P4 have the inner product in Example 7. Find d (p, 9). 
p-3-x4 х2, q—24 5x? 


Let M 4; have the inner product in Example 6. Find d (A, 2). 


COMNIS PES E 
ao 42116] 


Answer: 


(a) {105 
(b) 47 


Let P4 have the inner product of Example 9, and let p= 1 + x 4 x? and q—1- 2x7. Compute the following. 


(a) (P. 9) 
(b) ЇЇРЇ 
(с) (Pp, q) 


Let Рз have the evaluation inner product at the sample points 


xgo=—1, x1 =0, x2= 1, x3—2 


18. 


19. 


20. 


21. 


Find (р, 9} and ||p|| for p = x + x? and q=1+ x^. 
Answer: 


(p. 4) = 50, [р = 6/3 
In each part, use the given inner product on R? to find |х|, where w= ( — 1, 3). 
(a) the Euclidean inner product 


(b) the weighted Euclidean inner product Га, v) = 3uv + 207У2, where u = (#1, 22) and v = (v1, v2) 


ie 


Use the inner products in Exercise 18 to find d (u, v) for u = ( — 1, 2) and v = (2, 5). 


(c) the inner product generated by the matrix 


Answer: 


(a) 3y2 
(b) 305 
(c) 3y13 


Suppose that u, v, and w are vectors such that 
(u, v) m (v. w) = —3, (о, w) —5 
lull = 1, Ivi — 2. iwl =7 
Evaluate the given expression. 
(a) {u+ у, у dw) 
(b) (2v —w, Зу + Zw) 
(c) {u-v— 2w, 4u + v) 


(d) 19+ vll 
(e) ll2w — vll 
(f) Па 2v + 4w]| 


Sketch the unit circle іп 22 using the given inner product. 


4 16 
(b) (U, v) = Zutv1 + uv? 


(a) lu, v) = ly» + sn, 


Answer: 


(a) 


(b) 


23. 


24. 


25. 


26. 


27. 


28. 


Figure Ex-22 


Let u = (#1, 23) and v = (у, уз). Show that the following are inner products on R? by verifying that the inner product 
axioms hold. 

(а) (u, v) = 3utv1 + Suave 

(b) (u, v) = 44у] Huy +412 + 40272 


Answer: 


0 1 
-] 0 


Let u = (u1, 13, из) and v = (v, уз, v3). Determine which of the following are inner products on 23. For those that are 


Еог = | | then (И, V = — 2 < 0, so Axiom 4 fails. 


not, list the axioms that do not hold. 

(а) (u, v) = u1v1 + usv3 

(b) lu, y) = uty? =+ и2у2 + и2у2 

(с) (u, v) = Zutv1 + uv2 + 44373 
( 


u, V 
u, Vj — uiv, —u3v2 + u3v3 
Show that the following identity holds for vectors in any inner product space. 
2 2 2 2 
[т + vllt + [а — 1 = 21101 + 2l] vl 

Answer: 
(a) 228 

15 
(b) 0 


Show that the following identity holds for vectors in any inner product space. 


Цам lu- v? 
[v.v] = 10+ vi? -fluvi 


и] uj У] V2 ү : 
Let 7 = из ид апаў = v3 val Show that (U, V = uiv, + uv3 + зуд + u4v4is not an inner product on M 55. 
Calculus required Let the vector space P4 have the inner product 


1 
р, 9 = f. р(х) (х) dx 


29. 


30. 


31. 
32. 


(а) Find ||p|| for p= 1, P = x, and p = х2. 


(b) Find d (p, q) ifp = 1 and 9 = х. 


Calculus required Use the inner product 


1 
».a|- | rte) ax 


on P5, to compute (D, 9). 
(a) p=1—x4 x*4 5x7, q= х — 3x? 
(b p2x—5x2,q—2 + 8x? 


Calculus required In each part, use the inner product 


1 
гаје | se) ax 


on C'[O, 1] to compute (f, в). 
(a) Ё = cos2nx, а = sin2ax 
(b f =x, g=e" 


(c) f —tantz, g=1 


Prove that Formula 4 defines an inner product on $R^n$.


The definition of a complex vector space was given in the first margin note in Section 4.1. The definition of a complex inner product on a complex vector space V is identical to Definition 1 except that scalars are allowed to be complex numbers, and Axiom 1 is replaced by $\langle u, v\rangle = \overline{\langle v, u\rangle}$. The remaining axioms are unchanged. A complex vector space with a complex inner product is called a complex inner product space. Prove that if V is a complex inner product space then $\langle u, kv\rangle = \bar{k}\langle u, v\rangle$.


True-False Exercises

In parts (a)-(g) determine whether the statement is true or false, and justify your answer.

(a) The dot product on $R^2$ is an example of a weighted inner product.
Answer: True

(b) The inner product of two vectors cannot be a negative real number.
Answer: False

(c) $\langle u, v + w\rangle = \langle v, u\rangle + \langle w, u\rangle$.
Answer: True

(d) $\langle ku, kv\rangle = k^2\langle u, v\rangle$.
Answer: True

(e) If $\langle u, v\rangle = 0$, then u = 0 or v = 0.
Answer: False

(f) If $\|v\|^2 = 0$, then v = 0.
Answer: True

(g) If A is an n x n matrix, then $\langle u, v\rangle = Au \cdot Av$ defines an inner product on $R^n$.
Answer: False
6.2 Angle and Orthogonality in Inner Product Spaces


In Section 3.2 we defined the notion of "angle" between vectors in $R^n$. In this section we will extend this idea to general vector spaces. This will enable us to extend the notion of orthogonality as well, thereby setting the groundwork for a variety of new applications.


Cauchy-Schwarz Inequality 
Recall from Formula 20 of Section 3.2 that the angle $\theta$ between two vectors u and v in $R^n$ is

$$\theta = \cos^{-1}\left(\frac{u \cdot v}{\|u\|\,\|v\|}\right) \tag{1}$$

We were assured that this formula was valid because it followed from the Cauchy-Schwarz inequality (Theorem 3.2.4) that

$$-1 \le \frac{u \cdot v}{\|u\|\,\|v\|} \le 1 \tag{2}$$


as required for the inverse cosine to be defined. The following generalization of Theorem 3.2.4 will enable us 
to define the angle between two vectors in any real inner product space. 


THEOREM 6.2.1 Cauchy-Schwarz Inequality
If u and v are vectors in a real inner product space V, then

$$|\langle u, v\rangle| \le \|u\|\,\|v\| \tag{3}$$


Proof We warn you in advance that the proof presented here depends on a clever trick that is not easy to 
motivate. 


In the case where u = 0 the two sides of 3 are equal since $\langle u, v\rangle$ and $\|u\|$ are both zero. Thus, we need only consider the case where $u \ne \mathbf{0}$. Making this assumption, let

$$a = \langle u, u\rangle, \quad b = 2\langle u, v\rangle, \quad c = \langle v, v\rangle$$

and let t be any real number. Since the positivity axiom states that the inner product of any vector with itself is nonnegative, it follows that

$$0 \le \langle tu + v, tu + v\rangle = \langle u, u\rangle t^2 + 2\langle u, v\rangle t + \langle v, v\rangle = at^2 + bt + c$$

This inequality implies that the quadratic polynomial $at^2 + bt + c$ has either no real roots or a repeated real root. Therefore, its discriminant must satisfy the inequality $b^2 - 4ac \le 0$. Expressing the coefficients a, b, and c in terms of the vectors u and v gives $4\langle u, v\rangle^2 - 4\langle u, u\rangle\langle v, v\rangle \le 0$ or, equivalently,

$$\langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle$$

Taking square roots of both sides and using the fact that $\langle u, u\rangle$ and $\langle v, v\rangle$ are nonnegative yields

$$|\langle u, v\rangle| \le \langle u, u\rangle^{1/2}\langle v, v\rangle^{1/2} \quad\text{or equivalently}\quad |\langle u, v\rangle| \le \|u\|\,\|v\|$$

which completes the proof.


The following two alternative forms of the Cauchy-Schwarz inequality are useful to know:

$$\langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle \tag{4}$$

$$\langle u, v\rangle^2 \le \|u\|^2\,\|v\|^2 \tag{5}$$

The first of these formulas was obtained in the proof of Theorem 6.2.1, and the second is a variation of the first.
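As an informal illustration of Formula 4, the following sketch tests the inequality on randomly generated vectors in $R^2$. It assumes the weighted Euclidean inner product $\langle u, v\rangle = 3u_1v_1 + 2u_2v_2$; the weights, the random sampling, and the names `weighted_inner` and `violations` are choices made for this illustration only. A small tolerance absorbs floating-point rounding.

```python
import random

def weighted_inner(u, v, weights=(3, 2)):
    """Weighted Euclidean inner product; the weights are an assumed example."""
    return sum(w * a * b for w, a, b in zip(weights, u, v))

random.seed(0)  # fixed seed so the run is repeatable
violations = 0
for _ in range(1000):
    u = (random.uniform(-10, 10), random.uniform(-10, 10))
    v = (random.uniform(-10, 10), random.uniform(-10, 10))
    lhs = weighted_inner(u, v) ** 2                       # <u, v>^2
    rhs = weighted_inner(u, u) * weighted_inner(v, v)     # <u, u><v, v>
    if lhs > rhs * (1 + 1e-12) + 1e-12:
        violations += 1

print(violations)  # number of sampled pairs that violate Formula 4
```

A numerical test of this kind is not a proof; it only shows that the sampled pairs are consistent with the theorem.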


Angle Between Vectors 


Our next goal is to define what is meant by the "angle" between vectors in a real inner product space. As the first step, we leave it for you to use the Cauchy-Schwarz inequality to show that

$$-1 \le \frac{\langle u, v\rangle}{\|u\|\,\|v\|} \le 1 \tag{6}$$

This being the case, there is a unique angle $\theta$ in radian measure for which

$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} \quad\text{and}\quad 0 \le \theta \le \pi \tag{7}$$

(Figure 6.2.1). This enables us to define the angle $\theta$ between u and v to be

$$\theta = \cos^{-1}\left(\frac{\langle u, v\rangle}{\|u\|\,\|v\|}\right) \tag{8}$$

Figure 6.2.1


EXAMPLE 1 Cosine of an Angle Between Two Vectors in $R^4$

Let $R^4$ have the Euclidean inner product. Find the cosine of the angle $\theta$ between the vectors u = (4, 3, 1, -2) and v = (-2, 1, 2, 3).

Solution We leave it for you to verify that

$$\|u\| = \sqrt{30}, \quad \|v\| = \sqrt{18}, \quad\text{and}\quad \langle u, v\rangle = -9$$

from which it follows that

$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} = -\frac{9}{\sqrt{30}\sqrt{18}} = -\frac{3}{2\sqrt{15}}$$
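Formula 8 translates directly into a short computation. The sketch below assumes the Euclidean inner product and uses the vectors u = (4, 3, 1, -2) and v = (-2, 1, 2, 3), the pair consistent with the norms $\sqrt{30}$ and $\sqrt{18}$ stated in Example 1; the function names `inner` and `cos_angle` are hypothetical.

```python
import math

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def cos_angle(u, v):
    """Cosine of the angle between nonzero vectors u and v (Formula 7)."""
    return inner(u, v) / (math.sqrt(inner(u, u)) * math.sqrt(inner(v, v)))

u = (4, 3, 1, -2)
v = (-2, 1, 2, 3)

c = cos_angle(u, v)
theta = math.acos(c)  # angle in radians, between 0 and pi

print(c, theta)
```

The Cauchy-Schwarz inequality is what guarantees that the argument passed to `math.acos` lies in [-1, 1], apart from rounding error.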


Properties of Length and Distance in General Inner Product Spaces 


In Section 3.2 we used the dot product to extend the notions of length and distance to $R^n$, and we showed that various familiar theorems remained valid (see Theorem 3.2.5, Theorem 3.2.6, and Theorem 3.2.7). By making only minor adjustments to the proofs of those theorems, we can show that they remain valid in any real inner product space. For example, here is the generalization of Theorem 3.2.5 (the triangle inequalities).


THEOREM 6.2.2 


If u, v, and w are vectors in a real inner product space V, and if k is any scalar, then:
(a) $\|u + v\| \le \|u\| + \|v\|$ [Triangle inequality for vectors]
(b) $d(u, v) \le d(u, w) + d(w, v)$ [Triangle inequality for distances]

Proof (a)

$\|u + v\|^2 = \langle u + v, u + v\rangle$
$= \langle u, u\rangle + 2\langle u, v\rangle + \langle v, v\rangle$
$\le \langle u, u\rangle + 2|\langle u, v\rangle| + \langle v, v\rangle$ [Property of absolute value]
$\le \langle u, u\rangle + 2\|u\|\,\|v\| + \langle v, v\rangle$ [By (3)]
$= \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2$
$= (\|u\| + \|v\|)^2$

Taking square roots gives $\|u + v\| \le \|u\| + \|v\|$.

Proof (b) Identical to the proof of part (b) of Theorem 3.2.5.


Orthogonality 


Although Example 1 is a useful mathematical exercise, there is only an occasional need to compute angles in vector spaces other than $R^2$ and $R^3$. A problem of more interest in general vector spaces is ascertaining whether the angle between vectors is $\pi/2$. You should be able to see from Formula 8 that if u and v are nonzero vectors, then the angle between them is $\theta = \pi/2$ if and only if $\langle u, v\rangle = 0$. Accordingly, we make the following definition (which is applicable even if one or both of the vectors is zero).


DEFINITION 1 


Two vectors u and v in an inner product space are called orthogonal if $\langle u, v\rangle = 0$.


As the following example shows, orthogonality depends on the inner product in the sense that for different 
inner products two vectors can be orthogonal with respect to one but not the other. 


EXAMPLE 2 Orthogonality Depends on the Inner Product

The vectors u = (1, 1) and v = (1, -1) are orthogonal with respect to the Euclidean inner product on $R^2$, since
$$u \cdot v = (1)(1) + (1)(-1) = 0$$
However, they are not orthogonal with respect to the weighted Euclidean inner product $\langle u, v\rangle = 3u_1v_1 + 2u_2v_2$, since
$$\langle u, v\rangle = 3(1)(1) + 2(1)(-1) = 1 \ne 0$$
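The two computations in Example 2 can be reproduced with a few lines of code. This is a minimal sketch; the function names `euclidean` and `weighted` are illustrative, and the default weights (3, 2) are the ones used in the example.

```python
def euclidean(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def weighted(u, v, weights=(3, 2)):
    """Weighted Euclidean inner product <u, v> = 3*u1*v1 + 2*u2*v2."""
    return sum(w * a * b for w, a, b in zip(weights, u, v))

u = (1, 1)
v = (1, -1)

print(euclidean(u, v))  # 0, so u and v are orthogonal here
print(weighted(u, v))   # 1, so u and v are not orthogonal here
```

The same pair of vectors passes one orthogonality test and fails the other, which is the point of the example.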


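EXAMPLE 3 Orthogonal Vectors in $M_{22}$

If $M_{22}$ has the inner product of Example 6 in the preceding section, then the matrices

$$U = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \quad\text{and}\quad V = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}$$

are orthogonal, since

$$\langle U, V\rangle = 1(0) + 0(2) + 1(0) + 1(0) = 0$$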


CALCULUS REQUIRED 


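EXAMPLE 4 Orthogonal Vectors in $P_2$

Let $P_2$ have the inner product

$$\langle p, q\rangle = \int_{-1}^{1} p(x)q(x)\,dx$$

and let p = x and q = $x^2$. Then

$$\|p\| = \langle p, p\rangle^{1/2} = \left[\int_{-1}^{1} x\,x\,dx\right]^{1/2} = \left[\int_{-1}^{1} x^2\,dx\right]^{1/2} = \sqrt{\frac{2}{3}}$$

$$\|q\| = \langle q, q\rangle^{1/2} = \left[\int_{-1}^{1} x^2 x^2\,dx\right]^{1/2} = \left[\int_{-1}^{1} x^4\,dx\right]^{1/2} = \sqrt{\frac{2}{5}}$$

$$\langle p, q\rangle = \int_{-1}^{1} x\,x^2\,dx = \int_{-1}^{1} x^3\,dx = 0$$

Because $\langle p, q\rangle = 0$, the vectors p = x and q = $x^2$ are orthogonal relative to the given inner product.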
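The integrals in Example 4 can be evaluated exactly by machine. The sketch below represents a polynomial as a list of coefficients [a0, a1, a2, ...] and uses exact rational arithmetic; this representation and the helper names `poly_mul`, `integrate`, and `inner` are assumptions made for illustration.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [a0, a1, a2, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p, a, b):
    """Exact integral of the polynomial p over [a, b]."""
    total = Fraction(0)
    for k, coeff in enumerate(p):
        total += Fraction(coeff, k + 1) * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1))
    return total

def inner(p, q):
    """Inner product <p, q> = integral of p(x) q(x) over [-1, 1]."""
    return integrate(poly_mul(p, q), -1, 1)

p = [0, 1]      # p(x) = x
q = [0, 0, 1]   # q(x) = x^2

print(inner(p, p))  # ||p||^2 = 2/3
print(inner(q, q))  # ||q||^2 = 2/5
print(inner(p, q))  # 0, so p and q are orthogonal
```

Because the arithmetic is exact, the value 0 for $\langle p, q\rangle$ is an exact result rather than a rounding artifact.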


In Section 3.3 we proved the Theorem of Pythagoras for vectors in Euclidean n-space. The following theorem 
extends this result to vectors in any real inner product space. 


THEOREM 6.2.3 Generalized Theorem of Pythagoras

If u and v are orthogonal vectors in an inner product space, then

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$

Proof The orthogonality of u and v implies that $\langle u, v\rangle = 0$, so

$$\|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + 2\langle u, v\rangle + \|v\|^2 = \|u\|^2 + \|v\|^2$$


CALCULUS REQUIRED 


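EXAMPLE 5 Theorem of Pythagoras in $P_2$

In Example 4 we showed that p = x and q = $x^2$ are orthogonal with respect to the inner product

$$\langle p, q\rangle = \int_{-1}^{1} p(x)q(x)\,dx$$

on $P_2$. It follows from Theorem 6.2.3 that

$$\|p + q\|^2 = \|p\|^2 + \|q\|^2$$

Thus, from the computations in Example 4, we have

$$\|p + q\|^2 = \left(\sqrt{\frac{2}{3}}\right)^2 + \left(\sqrt{\frac{2}{5}}\right)^2 = \frac{2}{3} + \frac{2}{5} = \frac{16}{15}$$

We can check this result by direct integration:

$$\|p + q\|^2 = \langle p + q, p + q\rangle = \int_{-1}^{1} (x + x^2)(x + x^2)\,dx = \int_{-1}^{1} x^2\,dx + 2\int_{-1}^{1} x^3\,dx + \int_{-1}^{1} x^4\,dx = \frac{2}{3} + 0 + \frac{2}{5} = \frac{16}{15}$$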


Orthogonal Complements 


In Section 4.8 we defined the notion of an orthogonal complement for subspaces of $R^n$, and we used that definition to establish a geometric link between the fundamental spaces of a matrix. The following definition extends that idea to general inner product spaces.


DEFINITION 2


If W is a subspace of an inner product space V, then the set of all vectors in V that are orthogonal to every vector in W is called the orthogonal complement of W and is denoted by the symbol $W^\perp$.


In Theorem 4.8.8 we stated three properties of orthogonal complements in $R^n$. The following theorem generalizes parts (a) and (b) of that theorem to general inner product spaces.


THEOREM 6.2.4


If W is a subspace of an inner product space V, then:
(a) $W^\perp$ is a subspace of V.
(b) $W \cap W^\perp = \{\mathbf{0}\}$.


Proof (a) The set $W^\perp$ contains at least the zero vector, since $\langle \mathbf{0}, w\rangle = 0$ for every vector w in W. Thus, it remains to show that $W^\perp$ is closed under addition and scalar multiplication. To do this, suppose that u and v are vectors in $W^\perp$, so that for every vector w in W we have $\langle u, w\rangle = 0$ and $\langle v, w\rangle = 0$. It follows from the additivity and homogeneity axioms of inner products that

$$\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle = 0 + 0 = 0$$
$$\langle ku, w\rangle = k\langle u, w\rangle = k(0) = 0$$

which proves that u + v and ku are in $W^\perp$.


Proof (b) If v is any vector in both W and $W^\perp$, then v is orthogonal to itself; that is, $\langle v, v\rangle = 0$. It follows from the positivity axiom for inner products that v = 0.


The next theorem, which we state without proof, generalizes part (c) of Theorem 4.8.8. Note, however, that this theorem applies only to finite-dimensional inner product spaces, whereas Theorem 6.2.4 does not have this restriction.


THEOREM 6.2.5

If W is a subspace of a finite-dimensional inner product space V, then the orthogonal complement of $W^\perp$ is W; that is,

$$(W^\perp)^\perp = W$$

Theorem 6.2.5 implies that in a finite-dimensional inner product space orthogonal complements occur in pairs, each being orthogonal to the other (Figure 6.2.2).

Figure 6.2.2 Each vector in W is orthogonal to each vector in $W^\perp$ and conversely


In our study of the fundamental spaces of a matrix in Section 4.8 we showed that the row space and null space of a matrix are orthogonal complements with respect to the Euclidean inner product on $R^n$ (Theorem 4.8.9). The following example takes advantage of that fact.


EXAMPLE 6 Basis for an Orthogonal Complement

Let W be the subspace of $R^6$ spanned by the vectors
$w_1 = (1, 3, -2, 0, 2, 0)$, $w_2 = (2, 6, -5, -2, 4, -3)$,
$w_3 = (0, 0, 5, 10, 0, 15)$, $w_4 = (2, 6, 0, 8, 4, 18)$

Find a basis for the orthogonal complement of W.

Solution The space W is the same as the row space of the matrix

$$A = \begin{bmatrix} 1 & 3 & -2 & 0 & 2 & 0 \\ 2 & 6 & -5 & -2 & 4 & -3 \\ 0 & 0 & 5 & 10 & 0 & 15 \\ 2 & 6 & 0 & 8 & 4 & 18 \end{bmatrix}$$

Since the row space and null space of A are orthogonal complements, our problem reduces to finding a basis for the null space of this matrix. In Example 4 of Section 4.7 we showed that

$$v_1 = \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}$$

form a basis for this null space. Expressing these vectors in comma-delimited form (to match that of $w_1$, $w_2$, $w_3$, and $w_4$), we obtain the basis vectors
$v_1 = (-3, 1, 0, 0, 0, 0)$, $v_2 = (-4, 0, -2, 1, 0, 0)$, $v_3 = (-2, 0, 0, 0, 1, 0)$

You may want to check that these vectors are orthogonal to $w_1$, $w_2$, $w_3$, and $w_4$ by computing the necessary dot products.
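The check suggested at the end of Example 6 amounts to twelve dot products, which the following sketch computes. The list names `W_SPAN` and `BASIS` and the helper `dot` are illustrative; the basis vectors are the null space vectors of the matrix A, with first vector (-3, 1, 0, 0, 0, 0).

```python
def dot(u, v):
    """Euclidean inner product on R^6."""
    return sum(a * b for a, b in zip(u, v))

# Spanning vectors of W (the rows of A)
W_SPAN = [
    (1, 3, -2, 0, 2, 0),
    (2, 6, -5, -2, 4, -3),
    (0, 0, 5, 10, 0, 15),
    (2, 6, 0, 8, 4, 18),
]

# Proposed basis for the orthogonal complement of W
BASIS = [
    (-3, 1, 0, 0, 0, 0),
    (-4, 0, -2, 1, 0, 0),
    (-2, 0, 0, 0, 1, 0),
]

# One row of dot products per basis vector
dots = [[dot(v, w) for w in W_SPAN] for v in BASIS]
for row in dots:
    print(row)  # every entry is 0
```

Orthogonality to the four spanning vectors is enough, because a vector orthogonal to each spanning vector is orthogonal to every linear combination of them.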


Concept Review 

* Cauchy-Schwarz inequality 
* Angle between vectors 

* Orthogonal vectors 


* Orthogonal complement 


Skills 


* Find the angle between two vectors in an inner product space. 
* Determine whether two vectors in an inner product space are orthogonal. 


* Find a basis for the orthogonal complement of a subspace of an inner product space.


Exercise Set 6.2 


1. Let $R^2$, $R^3$, and $R^4$ have the Euclidean inner product. In each part, find the cosine of the angle between u and v.
(a) u = (1, -3), v = (2, 4)
(b) u = (-1, 0), v = (3, 8)
(c) u = (-1, 5, 2), v = (2, 4, -9)
(d) u = (4, 1, 8), v = (1, 0, -3)
(e) u = (1, 0, 1, 0), v = (-3, -3, -3, -3)
(f) u = (2, 1, 7, -1), v = (4, 0, 0, 0)

Answer:
(a) $-\frac{1}{\sqrt{2}}$
(b) $-\frac{3}{\sqrt{73}}$
(c) 0
(d) $-\frac{20}{9\sqrt{10}}$
(e) $-\frac{1}{\sqrt{2}}$
(f) $\frac{2}{\sqrt{55}}$

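2. Let $P_2$ have the inner product in Example 7 of Section 6.1. Find the cosine of the angle between p and q.
(a) $p = -1 + 5x + 2x^2$, $q = 2 + 4x - 9x^2$
(b) $p = x - x^2$, $q = 7 + 3x + 3x^2$

3. Let $M_{22}$ have the inner product in Example 6 of Section 6.1. Find the cosine of the angle between A and B.
(a) $A = \begin{bmatrix} 2 & 6 \\ 1 & -3 \end{bmatrix}$, $B = \begin{bmatrix} 3 & 2 \\ 1 & 0 \end{bmatrix}$
(b) $A = \begin{bmatrix} 2 & 4 \\ -1 & 3 \end{bmatrix}$, $B = \begin{bmatrix} -3 & 1 \\ 4 & 2 \end{bmatrix}$

Answer:

(a) $\frac{19}{10\sqrt{7}}$
(b) 0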
4. In each part, determine whether the given vectors are orthogonal with respect to the Euclidean inner product.
(a) u = (-1, 3, 2), v = (4, 2, -1)
(b) u = (-2, 2, -2), v = (1, 1, 1)
(c) u = ($u_1$, $u_2$, $u_3$), v = (0, 0, 0)
(d) u = (-4, 6, -10, 1), v = (2, 1, -2, 9)
(e) u = (0, 3, -2, 1), v = (5, 2, -1, 0)
(f) u = (a, b), v = (-b, a)


5. Show that $p = 1 - x + 2x^2$ and $q = 2x + x^2$ are orthogonal with respect to the inner product in Exercise 2.
6. Let 
Z 4 
gen 
Which of the following matrices are orthogonal to A with respect to the inner product in Exercise 3? 
(a) | -3 ;| 
0 2 


[1 1 
0 -1 


7. Do there exist scalars k and l such that the vectors u = (2, k, 6), v = (l, 5, 3), and w = (1, 2, 3) are mutually orthogonal with respect to the Euclidean inner product?


Answer: 


No 
8. Let $R^3$ have the Euclidean inner product, and suppose that u = (1, 1, -1) and v = (6, 7, -15). Find a value of k for which $\|ku + v\| = 13$.
9. Let $R^3$ have the Euclidean inner product. For which values of k are u and v orthogonal?
(a) u = (2, 1, 3), v = (1, 7, k)
(b) u = (k, k, 1), v = (k, 5, 6)


Answer:


(a) k = -3
(b) k = -2, -3

10. Let $R^4$ have the Euclidean inner product. Find two unit vectors that are orthogonal to all three of the vectors u = (2, 1, -4, 0), v = (-1, -1, 2, 2), and w = (3, 2, 5, 4).


11. In each part, verify that the Cauchy—Schwarz inequality holds for the given vectors using the Euclidean 
inner product. 


(a) u= (3, 2), v— (4, — 1) 
(b u=(—3, 1,9), v= (2, 21, 3) 
(с) ч=(—4,2,1), з= (8, —4, —2) 
(d u= (0, – 2, 2, 1), vz(-1, 21,1, 1) 
12. In each part, verify that the Cauchy-Schwarz inequality holds for the given vectors. 


(a) u= ( — 2, 1) and v = (1, 0) using the inner product of Example 1 of Section 6.1 . 


(b) i= Е 1 and = E ; using the inner product in Example 6 of Section 6.1 . 


(c) p= – 1 + 2x 4 x? and q—2- Ax? using the inner product given in Example 7 of Section 6.1 . 


13. Let $R^4$ have the Euclidean inner product, and let u = (-1, 1, 0, 2). Determine whether the vector u is orthogonal to the subspace spanned by the vectors $w_1 = (0, 0, 0, 0)$, $w_2 = (1, -1, 3, 0)$, and $w_3 = (4, 0, 9, 2)$.


Answer: 


No 


In Exercises 14-15, assume that $R^n$ has the Euclidean inner product.

14. Let W be the line in $R^2$ with equation y = 2x. Find an equation for $W^\perp$.

15. (a) Let W be the plane in $R^3$ with equation x - 2y - 3z = 0. Find parametric equations for $W^\perp$.
(b) Let W be the line in $R^3$ with parametric equations
x = 2t, y = -5t, z = 4t
Find an equation for $W^\perp$.
(c) Let W be the intersection of the two planes
x + y + z = 0 and x - y + z = 0
in $R^3$. Find an equation for $W^\perp$.

Answer:

(a) x = t, y = -2t, z = -3t
(b) 2x - 5y + 4z = 0
(c) x - z = 0

16. Find a basis for the orthogonal complement of the subspace of $R^n$ spanned by the vectors.
(a) $v_1 = (1, -1, 3)$, $v_2 = (5, -4, -4)$, $v_3 = (7, -6, 2)$
(b) $v_1 = (2, 0, -1)$, $v_2 = (4, 0, -2)$
(c) $v_1 = (1, 4, 5, 2)$, $v_2 = (2, 1, 3, 0)$, $v_3 = (-1, 3, 2, 2)$
(d) $v_1 = (1, 4, 5, 6, 9)$, $v_2 = (3, -2, 1, 4, -1)$, $v_3 = (-1, 0, -1, -2, -1)$, $v_4 = (2, 3, 5, 7, 8)$

17. Let V be an inner product space. Show that if u and v are orthogonal unit vectors in V, then $\|u - v\| = \sqrt{2}$.

18. Let V be an inner product space. Show that if w is orthogonal to both $u_1$ and $u_2$, then it is orthogonal to $k_1u_1 + k_2u_2$ for all scalars $k_1$ and $k_2$. Interpret this result geometrically in the case where V is $R^3$ with the Euclidean inner product.

19. Let V be an inner product space. Show that if w is orthogonal to each of the vectors $u_1, u_2, \ldots, u_r$, then it is orthogonal to every vector in span$\{u_1, u_2, \ldots, u_r\}$.

20. Let $\{v_1, v_2, \ldots, v_r\}$ be a basis for an inner product space V. Show that the zero vector is the only vector in V that is orthogonal to all of the basis vectors.

21. Let $\{w_1, w_2, \ldots, w_k\}$ be a basis for a subspace W of V. Show that $W^\perp$ consists of all vectors in V that are orthogonal to every basis vector.

22. Prove the following generalization of Theorem 6.2.3: If $v_1, v_2, \ldots, v_r$ are pairwise orthogonal vectors in an inner product space V, then
$$\|v_1 + v_2 + \cdots + v_r\|^2 = \|v_1\|^2 + \|v_2\|^2 + \cdots + \|v_r\|^2$$

23. Prove: If u and v are n x 1 matrices and A is an n x n matrix, then
$$(v^TA^TAu)^2 \le (u^TA^TAu)(v^TA^TAv)$$

24. Use the Cauchy-Schwarz inequality to prove that for all real values of a, b, and $\theta$,
$$(a\cos\theta + b\sin\theta)^2 \le a^2 + b^2$$

25. Prove: If $w_1, w_2, \ldots, w_n$ are positive real numbers, and if $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ are any two vectors in $R^n$, then
$$|w_1u_1v_1 + w_2u_2v_2 + \cdots + w_nu_nv_n| \le (w_1u_1^2 + w_2u_2^2 + \cdots + w_nu_n^2)^{1/2}(w_1v_1^2 + w_2v_2^2 + \cdots + w_nv_n^2)^{1/2}$$

26. Show that equality holds in the Cauchy-Schwarz inequality if and only if u and v are linearly dependent.

27. Use vector methods to prove that a triangle that is inscribed in a circle so that it has a diameter for a side must be a right triangle. [Hint: Express the vectors $\overrightarrow{AB}$ and $\overrightarrow{BC}$ in the accompanying figure in terms of u and v.]

Figure Ex-27


28. As illustrated in the accompanying figure, the vectors $u = (1, \sqrt{3})$ and $v = (-1, \sqrt{3})$ have norm 2 and an angle of 60° between them relative to the Euclidean inner product. Find a weighted Euclidean inner product with respect to which u and v are orthogonal unit vectors.


Figure Ex-28


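29. Calculus required Let f(x) and g(x) be continuous functions on [0, 1]. Prove:

(a) $\left[\int_0^1 f(x)g(x)\,dx\right]^2 \le \left[\int_0^1 f^2(x)\,dx\right]\left[\int_0^1 g^2(x)\,dx\right]$

(b) $\left[\int_0^1 [f(x) + g(x)]^2\,dx\right]^{1/2} \le \left[\int_0^1 f^2(x)\,dx\right]^{1/2} + \left[\int_0^1 g^2(x)\,dx\right]^{1/2}$

[Hint: Use the Cauchy-Schwarz inequality.]


30. Calculus required Let $C[0, \pi]$ have the inner product
$$\langle f, g\rangle = \int_0^\pi f(x)g(x)\,dx$$
and let $f_n = \cos nx$ (n = 0, 1, 2, ...). Show that if $k \ne l$, then $f_k$ and $f_l$ are orthogonal vectors.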


31. (a) Let W be the line y = x in an xy-coordinate system in $R^2$. Describe the subspace $W^\perp$.
(b) Let W be the y-axis in an xyz-coordinate system in $R^3$. Describe the subspace $W^\perp$.
(c) Let W be the yz-plane of an xyz-coordinate system in $R^3$. Describe the subspace $W^\perp$.


Answer:


(a) The line y = -x
(b) The xz-plane
(c) The x-axis


32. Prove that Formula 4 holds for all nonzero vectors u and v in an inner product space V. 


True-False Exercises
In parts (a)-(f) determine whether the statement is true or false, and justify your answer.

(a) If u is orthogonal to every vector of a subspace W, then u = 0.
Answer: False

(b) If u is a vector in both W and $W^\perp$, then u = 0.
Answer: True

(c) If u and v are vectors in $W^\perp$, then u + v is in $W^\perp$.
Answer: True

(d) If u is a vector in $W^\perp$ and k is a real number, then ku is in $W^\perp$.
Answer: True

(e) If u and v are orthogonal, then $|\langle u, v\rangle| = \|u\|\,\|v\|$.
Answer: False

(f) If u and v are orthogonal, then $\|u + v\| = \|u\| + \|v\|$.
Answer: False


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


6.3 Gram-Schmidt Process; QR-Decomposition


In many problems involving vector spaces, the problem solver is free to choose any basis for the vector space that 
seems appropriate. In inner product spaces, the solution of a problem is often greatly simplified by choosing a basis 
in which the vectors are orthogonal to one another. In this section we will show how such bases can be obtained. 


Orthogonal and Orthonormal Sets 


Recall from Section 6.2 that two vectors in an inner product space are said to be orthogonal if their inner product is 
zero. The following definition extends the notion of orthogonality to sets of vectors in an inner product space. 


DEFINITION 1 


A set of two or more vectors in a real inner product space is said to be orthogonal if all pairs of distinct vectors in the set are orthogonal. An orthogonal set in which each vector has norm 1 is said to be orthonormal.


EXAMPLE 1 An Orthogonal Set in $R^3$

Let
$u_1 = (0, 1, 0)$, $u_2 = (1, 0, 1)$, $u_3 = (1, 0, -1)$

and assume that $R^3$ has the Euclidean inner product. It follows that the set of vectors $S = \{u_1, u_2, u_3\}$ is orthogonal since $\langle u_1, u_2\rangle = \langle u_1, u_3\rangle = \langle u_2, u_3\rangle = 0$.


If v is a nonzero vector in an inner product space, then it follows from Theorem 6.1.1b with $k = 1/\|v\|$ that

$$\left\|\frac{1}{\|v\|}v\right\| = \left|\frac{1}{\|v\|}\right|\|v\| = \frac{1}{\|v\|}\|v\| = 1$$

from which we see that multiplying a nonzero vector by the reciprocal of its norm produces a vector of norm 1. This process is called normalizing v. It follows that any orthogonal set of nonzero vectors can be converted to an orthonormal set by normalizing each of its vectors.
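Normalization is simple to express in code. The sketch below assumes the Euclidean inner product and applies it to the orthogonal set (0, 1, 0), (1, 0, 1), (1, 0, -1); the function names `inner`, `norm`, and `normalize` are hypothetical.

```python
import math

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Norm induced by the inner product."""
    return math.sqrt(inner(v, v))

def normalize(v):
    """Multiply a nonzero vector by the reciprocal of its norm."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(x / n for x in v)

orthogonal_set = [(0, 1, 0), (1, 0, 1), (1, 0, -1)]
orthonormal_set = [normalize(v) for v in orthogonal_set]

for v in orthonormal_set:
    print(v, norm(v))  # each norm is 1 up to rounding
```

Scaling by a positive constant does not change which inner products are zero, so the normalized vectors remain mutually orthogonal.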


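EXAMPLE 2 Constructing an Orthonormal Set

The Euclidean norms of the vectors in Example 1 are

$$\|u_1\| = 1, \quad \|u_2\| = \sqrt{2}, \quad \|u_3\| = \sqrt{2}$$

Consequently, normalizing $u_1$, $u_2$, and $u_3$ yields

$$v_1 = \frac{u_1}{\|u_1\|} = (0, 1, 0), \quad v_2 = \frac{u_2}{\|u_2\|} = \left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad v_3 = \frac{u_3}{\|u_3\|} = \left(\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$$

We leave it for you to verify that the set $S = \{v_1, v_2, v_3\}$ is orthonormal by showing that
$\langle v_1, v_2\rangle = \langle v_1, v_3\rangle = \langle v_2, v_3\rangle = 0$ and $\|v_1\| = \|v_2\| = \|v_3\| = 1$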


In $R^2$ any two nonzero perpendicular vectors are linearly independent because neither is a scalar multiple of the other; and in $R^3$ any three nonzero mutually perpendicular vectors are linearly independent because no one lies in the plane of the other two (and hence is not expressible as a linear combination of the other two). The following theorem generalizes these observations.


THEOREM 6.3.1 


If $S = \{v_1, v_2, \ldots, v_n\}$ is an orthogonal set of nonzero vectors in an inner product space, then S is linearly independent.


Proof Assume that

$$k_1v_1 + k_2v_2 + \cdots + k_nv_n = \mathbf{0} \tag{1}$$

To demonstrate that $S = \{v_1, v_2, \ldots, v_n\}$ is linearly independent, we must prove that $k_1 = k_2 = \cdots = k_n = 0$.
For each $v_i$ in S, it follows from 1 that
$$\langle k_1v_1 + k_2v_2 + \cdots + k_nv_n, v_i\rangle = \langle \mathbf{0}, v_i\rangle = 0$$
or, equivalently,
$$k_1\langle v_1, v_i\rangle + k_2\langle v_2, v_i\rangle + \cdots + k_n\langle v_n, v_i\rangle = 0$$

From the orthogonality of S it follows that $\langle v_j, v_i\rangle = 0$ when $j \ne i$, so this equation reduces to

$$k_i\langle v_i, v_i\rangle = 0$$

Since the vectors in S are assumed to be nonzero, it follows from the positivity axiom for inner products that $\langle v_i, v_i\rangle \ne 0$. Thus, the preceding equation implies that each $k_i$ in Equation 1 is zero, which is what we wanted to prove.


Since an orthonormal set is orthogonal, and since its vectors are nonzero (norm 1), it follows from Theorem 6.3.1 that every orthonormal set is linearly independent.


In an inner product space, a basis consisting of orthonormal vectors is called an orthonormal basis, and a basis consisting of orthogonal vectors is called an orthogonal basis. A familiar example of an orthonormal basis is the standard basis for $R^n$ with the Euclidean inner product:
$e_1 = (1, 0, 0, \ldots, 0)$, $e_2 = (0, 1, 0, \ldots, 0)$, ..., $e_n = (0, 0, 0, \ldots, 1)$


EXAMPLE 3 An Orthonormal Basis

In Example 2 we showed that the vectors

$$v_1 = (0, 1, 0), \quad v_2 = \left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right), \quad\text{and}\quad v_3 = \left(\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\right)$$

form an orthonormal set with respect to the Euclidean inner product on $R^3$. By Theorem 6.3.1, these vectors form a linearly independent set, and since $R^3$ is three-dimensional, it follows from Theorem 4.5.4 that $S = \{v_1, v_2, v_3\}$ is an orthonormal basis for $R^3$.


Coordinates Relative to Orthonormal Bases 


One way to express a vector u as a linear combination of basis vectors
$$S = \{v_1, v_2, \ldots, v_n\}$$
is to convert the vector equation
$$u = c_1v_1 + c_2v_2 + \cdots + c_nv_n$$
to a linear system and solve for the coefficients $c_1, c_2, \ldots, c_n$. However, if the basis happens to be orthogonal or orthonormal, then the following theorem shows that the coefficients can be obtained more simply by computing appropriate inner products.


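THEOREM 6.3.2


(a) If $S = \{v_1, v_2, \ldots, v_n\}$ is an orthogonal basis for an inner product space V, and if u is any vector in V, then

$$u = \frac{\langle u, v_1\rangle}{\|v_1\|^2}v_1 + \frac{\langle u, v_2\rangle}{\|v_2\|^2}v_2 + \cdots + \frac{\langle u, v_n\rangle}{\|v_n\|^2}v_n \tag{2}$$

(b) If $S = \{v_1, v_2, \ldots, v_n\}$ is an orthonormal basis for an inner product space V, and if u is any vector in V, then

$$u = \langle u, v_1\rangle v_1 + \langle u, v_2\rangle v_2 + \cdots + \langle u, v_n\rangle v_n \tag{3}$$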


Proof (a) Since $S = \{v_1, v_2, \ldots, v_n\}$ is a basis for V, every vector u in V can be expressed in the form

$$u = c_1v_1 + c_2v_2 + \cdots + c_nv_n$$

We will complete the proof by showing that

$$c_i = \frac{\langle u, v_i\rangle}{\|v_i\|^2} \tag{4}$$

for i = 1, 2, ..., n. To do this, observe first that
$$\langle u, v_i\rangle = \langle c_1v_1 + c_2v_2 + \cdots + c_nv_n, v_i\rangle = c_1\langle v_1, v_i\rangle + c_2\langle v_2, v_i\rangle + \cdots + c_n\langle v_n, v_i\rangle$$
Since S is an orthogonal set, all of the inner products in the last equality are zero except the ith, so we have
$$\langle u, v_i\rangle = c_i\langle v_i, v_i\rangle = c_i\|v_i\|^2$$
Solving this equation for $c_i$ yields 4, which completes the proof.

Proof (b) In this case, $\|v_1\| = \|v_2\| = \cdots = \|v_n\| = 1$, so Formula 2 simplifies to Formula 3.
Using the terminology and notation from Definition 2 of Section 4.4, it follows from Theorem 6.3.2 that the coordinate vector of a vector u in V relative to an orthogonal basis $S = \{v_1, v_2, \ldots, v_n\}$ is
$$(u)_S = \left(\frac{\langle u, v_1\rangle}{\|v_1\|^2}, \frac{\langle u, v_2\rangle}{\|v_2\|^2}, \ldots, \frac{\langle u, v_n\rangle}{\|v_n\|^2}\right) \tag{5}$$
and relative to an orthonormal basis $S = \{v_1, v_2, \ldots, v_n\}$ is
$$(u)_S = (\langle u, v_1\rangle, \langle u, v_2\rangle, \ldots, \langle u, v_n\rangle) \tag{6}$$


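EXAMPLE 4 A Coordinate Vector Relative to an Orthonormal Basis

Let

$$v_1 = (0, 1, 0), \quad v_2 = \left(-\frac{4}{5}, 0, \frac{3}{5}\right), \quad v_3 = \left(\frac{3}{5}, 0, \frac{4}{5}\right)$$

It is easy to check that $S = \{v_1, v_2, v_3\}$ is an orthonormal basis for $R^3$ with the Euclidean inner product. Express the vector u = (1, 1, 1) as a linear combination of the vectors in S, and find the coordinate vector $(u)_S$.

Solution We leave it for you to verify that

$$\langle u, v_1\rangle = 1, \quad \langle u, v_2\rangle = -\frac{1}{5}, \quad\text{and}\quad \langle u, v_3\rangle = \frac{7}{5}$$

Therefore, by Theorem 6.3.2 we have

$$u = v_1 - \frac{1}{5}v_2 + \frac{7}{5}v_3$$

that is,

$$(1, 1, 1) = (0, 1, 0) - \frac{1}{5}\left(-\frac{4}{5}, 0, \frac{3}{5}\right) + \frac{7}{5}\left(\frac{3}{5}, 0, \frac{4}{5}\right)$$

Thus, the coordinate vector of u relative to S is

$$(u)_S = (\langle u, v_1\rangle, \langle u, v_2\rangle, \langle u, v_3\rangle) = \left(1, -\frac{1}{5}, \frac{7}{5}\right)$$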
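Formula 5 gives a direct recipe for coordinates relative to an orthogonal basis, and Formula 6 is the special case in which every basis vector has norm 1. The sketch below applies it with exact rational arithmetic to the orthonormal basis (0, 1, 0), (-4/5, 0, 3/5), (3/5, 0, 4/5) and the vector u = (1, 1, 1); the function names `inner` and `coordinates` are illustrative assumptions.

```python
from fractions import Fraction as F

def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def coordinates(u, basis):
    """Coordinates of u relative to an orthogonal basis (Formula 5).

    For an orthonormal basis each divisor <v, v> equals 1,
    so the result agrees with Formula 6.
    """
    return tuple(inner(u, v) / inner(v, v) for v in basis)

S = [
    (F(0), F(1), F(0)),
    (F(-4, 5), F(0), F(3, 5)),
    (F(3, 5), F(0), F(4, 5)),
]
u = (F(1), F(1), F(1))

coords = coordinates(u, S)

# Rebuild u from its coordinates as a consistency check
rebuilt = tuple(sum(c * v[i] for c, v in zip(coords, S)) for i in range(3))

print(coords)
print(rebuilt == u)
```

No linear system is solved here; each coordinate comes from a single inner product, which is the computational advantage the theorem describes.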


EXAMPLE 5 An Orthonormal Basis from an Orthogonal Basis + 


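(a) Show that the vectors

\[ \mathbf{w}_1 = (0, 2, 0), \quad \mathbf{w}_2 = (3, 0, 3), \quad \mathbf{w}_3 = (-4, 0, 4) \]

form an orthogonal basis for $R^3$ with the Euclidean inner product, and use that basis to find an orthonormal basis by normalizing each vector.

(b) Express the vector u = (1, 2, 4) as a linear combination of the orthonormal basis vectors obtained in part (a).

Solution
(a) The given vectors form an orthogonal set since

\[ \langle \mathbf{w}_1, \mathbf{w}_2\rangle = 0, \quad \langle \mathbf{w}_1, \mathbf{w}_3\rangle = 0, \quad \langle \mathbf{w}_2, \mathbf{w}_3\rangle = 0 \]

It follows from Theorem 6.3.1 that these vectors are linearly independent and hence form a basis for $R^3$ by Theorem 4.5.4. We leave it for you to calculate the norms of $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ and then obtain the orthonormal basis

\[ \mathbf{v}_1 = \frac{\mathbf{w}_1}{\|\mathbf{w}_1\|} = (0, 1, 0), \quad \mathbf{v}_2 = \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|} = \left(\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right), \quad \mathbf{v}_3 = \frac{\mathbf{w}_3}{\|\mathbf{w}_3\|} = \left(-\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right) \]

(b) It follows from Formula 3 that

\[ \mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1\rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2\rangle \mathbf{v}_2 + \langle \mathbf{u}, \mathbf{v}_3\rangle \mathbf{v}_3 \]

We leave it for you to confirm that

\[ \langle \mathbf{u}, \mathbf{v}_1\rangle = (1, 2, 4) \cdot (0, 1, 0) = 2, \quad \langle \mathbf{u}, \mathbf{v}_2\rangle = (1, 2, 4) \cdot \left(\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right) = \tfrac{5}{\sqrt{2}}, \quad \langle \mathbf{u}, \mathbf{v}_3\rangle = (1, 2, 4) \cdot \left(-\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right) = \tfrac{3}{\sqrt{2}} \]

and hence that

\[ (1, 2, 4) = 2(0, 1, 0) + \tfrac{5}{\sqrt{2}}\left(\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right) + \tfrac{3}{\sqrt{2}}\left(-\tfrac{1}{\sqrt{2}}, 0, \tfrac{1}{\sqrt{2}}\right) \]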


Orthogonal Projections 


Many applied problems are best solved by working with orthogonal or orthonormal basis vectors. Such bases are typically found by starting with some simple basis (say a standard basis) and then converting that basis into an orthogonal or orthonormal basis. To explain exactly how that is done will require some preliminary ideas about orthogonal projections.

In Section 3.3 we proved a result called the Projection Theorem (see Theorem 3.3.2) which dealt with the problem of decomposing a vector u in $R^n$ into a sum of two terms, $\mathbf{w}_1$ and $\mathbf{w}_2$, in which $\mathbf{w}_1$ is the orthogonal projection of u on some nonzero vector a and $\mathbf{w}_2$ is orthogonal to $\mathbf{w}_1$ (Figure 3.3.2). That result is a special case of the following more general theorem.


THEOREM 6.3.3 Projection Theorem 


If W is a finite-dimensional subspace of an inner product space V, then every vector u in V can be expressed in exactly one way as

\[ \mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2 \tag{7} \]

where $\mathbf{w}_1$ is in W and $\mathbf{w}_2$ is in $W^{\perp}$.


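The vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ in Formula 7 are commonly denoted by

\[ \mathbf{w}_1 = \operatorname{proj}_W \mathbf{u} \quad \text{and} \quad \mathbf{w}_2 = \operatorname{proj}_{W^{\perp}} \mathbf{u} \tag{8} \]

They are called the orthogonal projection of u on W and the orthogonal projection of u on $W^{\perp}$, respectively. The vector $\mathbf{w}_2$ is also called the component of u orthogonal to W. Using the notation in 8, Formula 7 can be expressed as

\[ \mathbf{u} = \operatorname{proj}_W \mathbf{u} + \operatorname{proj}_{W^{\perp}} \mathbf{u} \tag{9} \]

(Figure 6.3.1). Moreover, since $\operatorname{proj}_{W^{\perp}} \mathbf{u} = \mathbf{u} - \operatorname{proj}_W \mathbf{u}$, we can also express Formula 9 as

\[ \mathbf{u} = \operatorname{proj}_W \mathbf{u} + (\mathbf{u} - \operatorname{proj}_W \mathbf{u}) \tag{10} \]

Figure 6.3.1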


The following theorem provides formulas for calculating orthogonal projections. 


THEOREM 6.3.4 


Let W be a finite-dimensional subspace of an inner product space V. 


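(a) If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is an orthogonal basis for W, and u is any vector in V, then

\[ \operatorname{proj}_W \mathbf{u} = \frac{\langle \mathbf{u}, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{u}, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{u}, \mathbf{v}_r\rangle}{\|\mathbf{v}_r\|^2}\mathbf{v}_r \tag{11} \]

(b) If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is an orthonormal basis for W, and u is any vector in V, then

\[ \operatorname{proj}_W \mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1\rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2\rangle \mathbf{v}_2 + \cdots + \langle \mathbf{u}, \mathbf{v}_r\rangle \mathbf{v}_r \tag{12} \]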


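Proof (a) It follows from Theorem 6.3.3 that the vector u can be expressed in the form $\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2$, where $\mathbf{w}_1 = \operatorname{proj}_W \mathbf{u}$ is in W and $\mathbf{w}_2$ is in $W^{\perp}$; and it follows from Theorem 6.3.2 that the component $\operatorname{proj}_W \mathbf{u} = \mathbf{w}_1$ can be expressed in terms of the basis vectors for W as

\[ \operatorname{proj}_W \mathbf{u} = \mathbf{w}_1 = \frac{\langle \mathbf{w}_1, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{w}_1, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{w}_1, \mathbf{v}_r\rangle}{\|\mathbf{v}_r\|^2}\mathbf{v}_r \tag{13} \]

Since $\mathbf{w}_2$ is orthogonal to W, it follows that

\[ \langle \mathbf{w}_2, \mathbf{v}_1\rangle = \langle \mathbf{w}_2, \mathbf{v}_2\rangle = \cdots = \langle \mathbf{w}_2, \mathbf{v}_r\rangle = 0 \]

so we can rewrite 13 as

\[ \operatorname{proj}_W \mathbf{u} = \mathbf{w}_1 = \frac{\langle \mathbf{w}_1 + \mathbf{w}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{w}_1 + \mathbf{w}_2, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{w}_1 + \mathbf{w}_2, \mathbf{v}_r\rangle}{\|\mathbf{v}_r\|^2}\mathbf{v}_r \]

or, equivalently, as

\[ \operatorname{proj}_W \mathbf{u} = \mathbf{w}_1 = \frac{\langle \mathbf{u}, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 + \frac{\langle \mathbf{u}, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{u}, \mathbf{v}_r\rangle}{\|\mathbf{v}_r\|^2}\mathbf{v}_r \]

Proof (b) In this case, $\|\mathbf{v}_1\| = \|\mathbf{v}_2\| = \cdots = \|\mathbf{v}_r\| = 1$, so Formula 11 simplifies to Formula 12.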


EXAMPLE 6 Calculating Projections — 


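Let $R^3$ have the Euclidean inner product, and let W be the subspace spanned by the orthonormal vectors $\mathbf{v}_1 = (0, 1, 0)$ and $\mathbf{v}_2 = \left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right)$. From Formula 12 the orthogonal projection of u = (1, 1, 1) on W is

\[ \operatorname{proj}_W \mathbf{u} = \langle \mathbf{u}, \mathbf{v}_1\rangle \mathbf{v}_1 + \langle \mathbf{u}, \mathbf{v}_2\rangle \mathbf{v}_2 = (1)(0, 1, 0) + \left(-\tfrac{1}{5}\right)\left(-\tfrac{4}{5}, 0, \tfrac{3}{5}\right) = \left(\tfrac{4}{25}, 1, -\tfrac{3}{25}\right) \]

The component of u orthogonal to W is

\[ \operatorname{proj}_{W^{\perp}} \mathbf{u} = \mathbf{u} - \operatorname{proj}_W \mathbf{u} = (1, 1, 1) - \left(\tfrac{4}{25}, 1, -\tfrac{3}{25}\right) = \left(\tfrac{21}{25}, 0, \tfrac{28}{25}\right) \]

Observe that $\operatorname{proj}_{W^{\perp}} \mathbf{u}$ is orthogonal to both $\mathbf{v}_1$ and $\mathbf{v}_2$, so this vector is orthogonal to each vector in the space W spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$, as it should be.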
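The projection formulas of Theorem 6.3.4 translate directly into code. The following is a minimal sketch under stated assumptions: the function names `dot` and `project` are hypothetical, the inner product is the Euclidean one, and the data are u = (1, 1, 1) with the orthonormal vectors v1 = (0, 1, 0) and v2 = (-4/5, 0, 3/5). It computes the projection on W and then confirms that the leftover component is orthogonal to each basis vector.

```python
from fractions import Fraction as F

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project(u, orthogonal_basis):
    """Orthogonal projection of u on the span of an orthogonal basis (Formula 11).

    When the basis is orthonormal each divisor is 1 and this is Formula 12.
    """
    result = [F(0)] * len(u)
    for v in orthogonal_basis:
        c = F(dot(u, v)) / dot(v, v)          # coefficient <u, v> / ||v||^2
        result = [r + c * x for r, x in zip(result, v)]
    return result

v1 = (0, 1, 0)
v2 = (F(-4, 5), 0, F(3, 5))
u = (1, 1, 1)

p = project(u, [v1, v2])                      # projection of u on W
perp = [a - b for a, b in zip(u, p)]          # component of u orthogonal to W

print(p, perp)
print(dot(perp, v1), dot(perp, v2))           # both inner products should vanish
```

Checking the two inner products at the end is a cheap way to catch arithmetic slips when the same computation is done by hand.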


A Geometric Interpretation of Orthogonal Projections 


If W is a one-dimensional subspace of an inner product space V, say span{a}, then Formula 11 has only the one term

\[ \operatorname{proj}_W \mathbf{u} = \frac{\langle \mathbf{u}, \mathbf{a}\rangle}{\|\mathbf{a}\|^2}\mathbf{a} \]

In the special case where V is $R^3$ with the Euclidean inner product, this is exactly Formula 10 of Section 3.3 for the orthogonal projection of u along a. This suggests that we can think of 11 as the sum of orthogonal projections on “axes” determined by the basis vectors for the subspace W (Figure 6.3.2).

Figure 6.3.2


The Gram-Schmidt Process 


We have seen that orthonormal bases exhibit a variety of useful properties. Our next theorem, which is the main 
result in this section, shows that every nonzero finite-dimensional vector space has an orthonormal basis. The proof 
of this result is extremely important, since it provides an algorithm, or method, for converting an arbitrary basis into 
an orthonormal basis. 


THEOREM 6.3.5 


Every nonzero finite-dimensional inner product space has an orthonormal basis. 


Proof Let W be any nonzero finite-dimensional subspace of an inner product space, and suppose that $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$ is any basis for W. It suffices to show that W has an orthogonal basis, since the vectors in that basis can be normalized to obtain an orthonormal basis. The following sequence of steps will produce an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ for W:


Step 1. Let $\mathbf{v}_1 = \mathbf{u}_1$.


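Step 2. As illustrated in Figure 6.3.3, we can obtain a vector $\mathbf{v}_2$ that is orthogonal to $\mathbf{v}_1$ by computing the component of $\mathbf{u}_2$ that is orthogonal to the space $W_1$ spanned by $\mathbf{v}_1$. Using Formula 11 to perform this computation we obtain

\[ \mathbf{v}_2 = \mathbf{u}_2 - \operatorname{proj}_{W_1} \mathbf{u}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 \]

Of course, if $\mathbf{v}_2 = \mathbf{0}$, then $\mathbf{v}_2$ is not a basis vector. But this cannot happen, since it would then follow from the above formula for $\mathbf{v}_2$ that

\[ \mathbf{u}_2 = \frac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 = \frac{\langle \mathbf{u}_2, \mathbf{u}_1\rangle}{\|\mathbf{u}_1\|^2}\mathbf{u}_1 \]

which implies that $\mathbf{u}_2$ is a multiple of $\mathbf{u}_1$, contradicting the linear independence of the basis $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$.

Figure 6.3.3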


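Step 3. To construct a vector $\mathbf{v}_3$ that is orthogonal to both $\mathbf{v}_1$ and $\mathbf{v}_2$, we compute the component of $\mathbf{u}_3$ orthogonal to the space $W_2$ spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$ (Figure 6.3.4). Using Formula 11 to perform this computation we obtain

\[ \mathbf{v}_3 = \mathbf{u}_3 - \operatorname{proj}_{W_2} \mathbf{u}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 \]

As in Step 2, the linear independence of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$ ensures that $\mathbf{v}_3 \neq \mathbf{0}$. We leave the details for you.

Figure 6.3.4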


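Step 4. To determine a vector $\mathbf{v}_4$ that is orthogonal to $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, we compute the component of $\mathbf{u}_4$ orthogonal to the space $W_3$ spanned by $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. From 11,

\[ \mathbf{v}_4 = \mathbf{u}_4 - \operatorname{proj}_{W_3} \mathbf{u}_4 = \mathbf{u}_4 - \frac{\langle \mathbf{u}_4, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_4, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 - \frac{\langle \mathbf{u}_4, \mathbf{v}_3\rangle}{\|\mathbf{v}_3\|^2}\mathbf{v}_3 \]

Continuing in this way we will produce an orthogonal set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ after r steps. Since orthogonal sets are linearly independent, this set will be an orthogonal basis for the r-dimensional space W. By normalizing these basis vectors we can obtain an orthonormal basis.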


The step-by-step construction of an orthogonal (or orthonormal) basis given in the foregoing proof is called the Gram-Schmidt process. For reference, we provide the following summary of the steps.


The Gram-Schmidt Process 


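To convert a basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_r\}$ into an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$, perform the following computations:

Step 1. $\mathbf{v}_1 = \mathbf{u}_1$

Step 2. $\mathbf{v}_2 = \mathbf{u}_2 - \dfrac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1$

Step 3. $\mathbf{v}_3 = \mathbf{u}_3 - \dfrac{\langle \mathbf{u}_3, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \dfrac{\langle \mathbf{u}_3, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2$

Step 4. $\mathbf{v}_4 = \mathbf{u}_4 - \dfrac{\langle \mathbf{u}_4, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \dfrac{\langle \mathbf{u}_4, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 - \dfrac{\langle \mathbf{u}_4, \mathbf{v}_3\rangle}{\|\mathbf{v}_3\|^2}\mathbf{v}_3$

(continue for r steps)

Optional Step. To convert the orthogonal basis into an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_r\}$, normalize the orthogonal basis vectors.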
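The steps above can be sketched as a short Python routine. This is an illustrative sketch rather than a production implementation: the names `gram_schmidt` and `normalize` are hypothetical, the Euclidean inner product on $R^n$ is assumed, and the input is assumed to be a linearly independent list of vectors. Exact fractions are used for the orthogonal basis; floating point enters only in the optional normalization step.

```python
from fractions import Fraction
import math

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Convert a basis u_1, ..., u_r into an orthogonal basis v_1, ..., v_r."""
    ortho = []
    for u in basis:
        v = [Fraction(x) for x in u]
        for w in ortho:
            # Subtract the projection of u on the earlier vector w.
            c = Fraction(dot(u, w)) / dot(w, w)
            v = [vi - c * wi for vi, wi in zip(v, w)]
        if not any(v):
            raise ValueError("input vectors are linearly dependent")
        ortho.append(v)
    return ortho

def normalize(v):
    """Optional step: scale v to unit length (returns floats)."""
    n = math.sqrt(dot(v, v))
    return [float(x) / n for x in v]

# Sample basis for R^3, chosen here for illustration.
vs = gram_schmidt([(1, 1, 1), (0, 1, 1), (0, 0, 1)])
qs = [normalize(v) for v in vs]
print(vs)
print(qs)
```

For hand-sized problems the exact arithmetic makes the output directly comparable with a worked solution. For larger numerical problems, implementations commonly prefer variants that are less sensitive to rounding than this direct transcription.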


EXAMPLE 7 Using the Gram-Schmidt Process + 


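Assume that the vector space $R^3$ has the Euclidean inner product. Apply the Gram-Schmidt process to transform the basis vectors

\[ \mathbf{u}_1 = (1, 1, 1), \quad \mathbf{u}_2 = (0, 1, 1), \quad \mathbf{u}_3 = (0, 0, 1) \]

into an orthogonal basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, and then normalize the orthogonal basis vectors to obtain an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}$.

Solution

Step 1. $\mathbf{v}_1 = \mathbf{u}_1 = (1, 1, 1)$

Step 2.
\[ \mathbf{v}_2 = \mathbf{u}_2 - \operatorname{proj}_{W_1} \mathbf{u}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 = (0, 1, 1) - \tfrac{2}{3}(1, 1, 1) = \left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) \]

Step 3.
\[ \mathbf{v}_3 = \mathbf{u}_3 - \operatorname{proj}_{W_2} \mathbf{u}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 = (0, 0, 1) - \tfrac{1}{3}(1, 1, 1) - \frac{1/3}{2/3}\left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) = \left(0, -\tfrac{1}{2}, \tfrac{1}{2}\right) \]

Thus,

\[ \mathbf{v}_1 = (1, 1, 1), \quad \mathbf{v}_2 = \left(-\tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right), \quad \mathbf{v}_3 = \left(0, -\tfrac{1}{2}, \tfrac{1}{2}\right) \]

form an orthogonal basis for $R^3$. The norms of these vectors are

\[ \|\mathbf{v}_1\| = \sqrt{3}, \quad \|\mathbf{v}_2\| = \frac{\sqrt{6}}{3}, \quad \|\mathbf{v}_3\| = \frac{1}{\sqrt{2}} \]

so an orthonormal basis for $R^3$ is

\[ \mathbf{q}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} = \left(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}\right), \quad \mathbf{q}_2 = \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} = \left(-\tfrac{2}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}\right), \quad \mathbf{q}_3 = \frac{\mathbf{v}_3}{\|\mathbf{v}_3\|} = \left(0, -\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right) \]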


Remark In the last example we normalized at the end to convert the orthogonal basis into an orthonormal basis.
Alternatively, we could have normalized each orthogonal basis vector as soon as it was obtained, thereby producing 
an orthonormal basis step by step. However, that procedure generally has the disadvantage in hand calculation of 
producing more square roots to manipulate. A more useful variation is to “scale” the orthogonal basis vectors at 
each step to eliminate some of the fractions. For example, after Step 2 above, we could have multiplied by 3 to 
produce ( — 2, 1, 1) as the second orthogonal basis vector, thereby simplifying the calculations in Step 3. 


Erhardt Schmidt (1875—1959) 


Historical Note Schmidt was a German mathematician who studied for his doctoral degree at Göttingen University under David Hilbert, one of the giants of modern mathematics. For most of his life he taught at Berlin University where, in addition to making important contributions to many branches of mathematics, he fashioned some of Hilbert's ideas into a general concept, called a Hilbert space, which is a fundamental idea in the study of infinite-dimensional vector spaces. He first described the process that bears his name in a paper on integral equations that he published in 1907.
[Image: Archives of the Mathematisches Forschungsinst] 


Jorgen Pederson Gram (1850–1916)


Historical Note Gram was a Danish actuary whose early education was at village schools
supplemented by private tutoring. He obtained a doctorate degree in mathematics while working for the
Hafnia Life Insurance Company, where he specialized in the mathematics of accident insurance. It was in his
dissertation that his contributions to the Gram-Schmidt process were formulated. He eventually became
interested in abstract mathematics and received a gold medal from the Royal Danish Society of Sciences 
and Letters in recognition of his work. His lifelong interest in applied mathematics never wavered, however, 
and he produced a variety of treatises on Danish forest management. 

[Image: wikipedia] 


CALCULUS REQUIRED 
EXAMPLE 8 Legendre Polynomials + 


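Let the vector space $P_2$ have the inner product

\[ \langle p, q\rangle = \int_{-1}^{1} p(x)q(x)\,dx \]

Apply the Gram-Schmidt process to transform the standard basis $\{1, x, x^2\}$ for $P_2$ into an orthogonal basis $\{\phi_1(x), \phi_2(x), \phi_3(x)\}$.

Solution Take $\mathbf{u}_1 = 1$, $\mathbf{u}_2 = x$, and $\mathbf{u}_3 = x^2$.

Step 1. $\mathbf{v}_1 = \mathbf{u}_1 = 1$

Step 2. We have

\[ \langle \mathbf{u}_2, \mathbf{v}_1\rangle = \int_{-1}^{1} x\,dx = 0 \]

so

\[ \mathbf{v}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 = \mathbf{u}_2 = x \]

Step 3. We have

\[ \langle \mathbf{u}_3, \mathbf{v}_1\rangle = \int_{-1}^{1} x^2\,dx = \tfrac{2}{3}, \quad \langle \mathbf{u}_3, \mathbf{v}_2\rangle = \int_{-1}^{1} x^3\,dx = 0, \quad \|\mathbf{v}_1\|^2 = \langle \mathbf{v}_1, \mathbf{v}_1\rangle = \int_{-1}^{1} 1\,dx = 2 \]

so

\[ \mathbf{v}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1\rangle}{\|\mathbf{v}_1\|^2}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2\rangle}{\|\mathbf{v}_2\|^2}\mathbf{v}_2 = x^2 - \tfrac{1}{3} \]

Thus, we have obtained the orthogonal basis $\{\phi_1(x), \phi_2(x), \phi_3(x)\}$ in which

\[ \phi_1(x) = 1, \quad \phi_2(x) = x, \quad \phi_3(x) = x^2 - \tfrac{1}{3} \]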
3 


Remark The orthogonal basis vectors in the foregoing example are often scaled so all three functions have a value of 1 at x = 1. The resulting polynomials

\[ 1, \quad x, \quad \tfrac{1}{2}(3x^2 - 1) \]


which are known as the first three Legendre polynomials, play an important role in a variety of applications. The 
scaling does not affect the orthogonality. 
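The same Gram-Schmidt computation can be carried out mechanically for polynomials. The sketch below is an illustration under assumptions that are not part of the text: a polynomial is stored as a list of coefficients [a0, a1, a2] for a0 + a1 x + a2 x^2, the inner product is the integral of p(x)q(x) over [-1, 1], and the function names are hypothetical. The integral of x^k over [-1, 1] is 2/(k+1) when k is even and 0 when k is odd, which is all the calculus the code needs.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(x) q(x) over [-1, 1], for coefficient lists p and q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:               # odd powers integrate to zero
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def gram_schmidt(polys):
    """Orthogonalize a list of polynomials with respect to inner()."""
    ortho = []
    for u in polys:
        v = [Fraction(c) for c in u]
        for w in ortho:
            c = inner(u, w) / inner(w, w)
            v = [vi - c * wi for vi, wi in zip(v, w)]
        ortho.append(v)
    return ortho

def scale_to_one_at_1(p):
    """Rescale p so that p(1) = 1; p(1) is the sum of the coefficients."""
    s = sum(p)
    return [c / s for c in p]

standard_basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # 1, x, x^2
phi = gram_schmidt(standard_basis)
legendre = [scale_to_one_at_1(p) for p in phi]
print(phi)
print(legendre)
```

The last list holds the coefficients of 1, x, and (3x^2 - 1)/2, and the rescaling leaves every pairwise inner product equal to zero.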


Extending Orthonormal Sets to Orthonormal Bases 


Recall from part (b) of Theorem 4.5.5 that a linearly independent set in a finite-dimensional vector space can be 
enlarged to a basis by adding appropriate vectors. The following theorem is an analog of that result for orthogonal 
and orthonormal sets in finite-dimensional inner product spaces. 


THEOREM 6.3.6 


If W is a finite-dimensional inner product space, then: 
(a) Every orthogonal set of nonzero vectors in W can be enlarged to an orthogonal basis for W. 


(b) Every orthonormal set in W can be enlarged to an orthonormal basis for W. 


We will prove part (b) and leave part (a) as an exercise.


Proof (b) Suppose that $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_s\}$ is an orthonormal set of vectors in W. Part (b) of Theorem 4.5.5 tells us that we can enlarge S to some basis

\[ S' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_s, \mathbf{v}_{s+1}, \ldots, \mathbf{v}_k\} \]

for W. If we now apply the Gram-Schmidt process to the set $S'$, then the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_s$ will not be affected since they are already orthonormal, and the resulting set

\[ S'' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_s, \mathbf{v}'_{s+1}, \ldots, \mathbf{v}'_k\} \]

will be an orthonormal basis for W.


OPTIONAL 
QR-Decomposition 


In recent years a numerical algorithm based on the Gram-Schmidt process, and known as QR-decomposition, has 
assumed growing importance as the mathematical foundation for a wide variety of numerical algorithms, including 
those for computing eigenvalues of large matrices. The technical aspects of such algorithms are discussed in 
textbooks that specialize in the numerical aspects of linear algebra. However, we will discuss some of the 
underlying ideas here. We begin by posing the following problem. 


Problem 


If A is an m × n matrix with linearly independent column vectors, and if Q is the matrix that results by applying the Gram-Schmidt process to the column vectors of A, what relationship, if any, exists between A and Q?


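To solve this problem, suppose that the column vectors of A are $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ and the orthonormal column vectors of Q are $\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n$. Thus, A and Q can be written in partitioned form as

\[ A = [\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] \quad \text{and} \quad Q = [\mathbf{q}_1 \mid \mathbf{q}_2 \mid \cdots \mid \mathbf{q}_n] \]

It follows from Theorem 6.3.2b that $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$ are expressible in terms of the vectors $\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_n$ as

\[
\begin{aligned}
\mathbf{u}_1 &= \langle \mathbf{u}_1, \mathbf{q}_1\rangle \mathbf{q}_1 + \langle \mathbf{u}_1, \mathbf{q}_2\rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_1, \mathbf{q}_n\rangle \mathbf{q}_n \\
\mathbf{u}_2 &= \langle \mathbf{u}_2, \mathbf{q}_1\rangle \mathbf{q}_1 + \langle \mathbf{u}_2, \mathbf{q}_2\rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_2, \mathbf{q}_n\rangle \mathbf{q}_n \\
&\;\;\vdots \\
\mathbf{u}_n &= \langle \mathbf{u}_n, \mathbf{q}_1\rangle \mathbf{q}_1 + \langle \mathbf{u}_n, \mathbf{q}_2\rangle \mathbf{q}_2 + \cdots + \langle \mathbf{u}_n, \mathbf{q}_n\rangle \mathbf{q}_n
\end{aligned}
\]

Recalling from Section 1.3 (Example 9) that the jth column vector of a matrix product is a linear combination of the column vectors of the first factor with coefficients coming from the jth column of the second factor, it follows that these relationships can be expressed in matrix form as

\[
[\mathbf{u}_1 \mid \mathbf{u}_2 \mid \cdots \mid \mathbf{u}_n] = [\mathbf{q}_1 \mid \mathbf{q}_2 \mid \cdots \mid \mathbf{q}_n]
\begin{bmatrix}
\langle \mathbf{u}_1, \mathbf{q}_1\rangle & \langle \mathbf{u}_2, \mathbf{q}_1\rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_1\rangle \\
\langle \mathbf{u}_1, \mathbf{q}_2\rangle & \langle \mathbf{u}_2, \mathbf{q}_2\rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_2\rangle \\
\vdots & \vdots & & \vdots \\
\langle \mathbf{u}_1, \mathbf{q}_n\rangle & \langle \mathbf{u}_2, \mathbf{q}_n\rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_n\rangle
\end{bmatrix}
\]

or more briefly as

\[ A = QR \tag{14} \]

where R is the second factor in the product. However, it is a property of the Gram-Schmidt process that for $j \geq 2$, the vector $\mathbf{q}_j$ is orthogonal to $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_{j-1}$. Thus, all entries below the main diagonal of R are zero, and R has the form

\[
R = \begin{bmatrix}
\langle \mathbf{u}_1, \mathbf{q}_1\rangle & \langle \mathbf{u}_2, \mathbf{q}_1\rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_1\rangle \\
0 & \langle \mathbf{u}_2, \mathbf{q}_2\rangle & \cdots & \langle \mathbf{u}_n, \mathbf{q}_2\rangle \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \langle \mathbf{u}_n, \mathbf{q}_n\rangle
\end{bmatrix} \tag{15}
\]

We leave it for you to show that R is invertible by showing that its diagonal entries are nonzero. Thus, Equation 14 is a factorization of A into the product of a matrix Q with orthonormal column vectors and an invertible upper triangular matrix R. We call Equation 14 the QR-decomposition of A. In summary, we have the following theorem.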


THEOREM 6.3.7 QR-Decomposition 


If A is an m × n matrix with linearly independent column vectors, then A can be factored as

\[ A = QR \]

where Q is an m × n matrix with orthonormal column vectors, and R is an n × n invertible upper triangular matrix.


It is common in numerical linear algebra to say 
that a matrix with linearly independent columns 
has full column rank. 


Recall from Theorem 5.1.6 (the Equivalence Theorem) that a square matrix has linearly independent column 
vectors if and only if it is invertible. Thus, it follows from the foregoing theorem that every invertible matrix has a 
QR-decomposition. 


EXAMPLE 9 QR-Decomposition of a 3 x 3 Matrix — 


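Find the QR-decomposition of

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \]

Solution The column vectors of A are

\[ \mathbf{u}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Applying the Gram-Schmidt process with normalization to these column vectors yields the orthonormal vectors (see Example 7)

\[ \mathbf{q}_1 = \begin{bmatrix} \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{3}} \\ \tfrac{1}{\sqrt{3}} \end{bmatrix}, \quad \mathbf{q}_2 = \begin{bmatrix} -\tfrac{2}{\sqrt{6}} \\ \tfrac{1}{\sqrt{6}} \\ \tfrac{1}{\sqrt{6}} \end{bmatrix}, \quad \mathbf{q}_3 = \begin{bmatrix} 0 \\ -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix} \]

Thus, it follows from Formula 15 that R is

\[
R = \begin{bmatrix}
\langle \mathbf{u}_1, \mathbf{q}_1\rangle & \langle \mathbf{u}_2, \mathbf{q}_1\rangle & \langle \mathbf{u}_3, \mathbf{q}_1\rangle \\
0 & \langle \mathbf{u}_2, \mathbf{q}_2\rangle & \langle \mathbf{u}_3, \mathbf{q}_2\rangle \\
0 & 0 & \langle \mathbf{u}_3, \mathbf{q}_3\rangle
\end{bmatrix}
= \begin{bmatrix}
\tfrac{3}{\sqrt{3}} & \tfrac{2}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} \\
0 & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \\
0 & 0 & \tfrac{1}{\sqrt{2}}
\end{bmatrix}
\]

Show that the matrix Q in Example 9 has the property $Q^TQ = I$, and show that every m × n matrix with orthonormal column vectors has this property.

from which it follows that the QR-decomposition of A is

\[
\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix}
\tfrac{1}{\sqrt{3}} & -\tfrac{2}{\sqrt{6}} & 0 \\
\tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{6}} & -\tfrac{1}{\sqrt{2}} \\
\tfrac{1}{\sqrt{3}} & \tfrac{1}{\sqrt{6}} & \tfrac{1}{\sqrt{2}}
\end{bmatrix}}_{Q}
\underbrace{\begin{bmatrix}
\tfrac{3}{\sqrt{3}} & \tfrac{2}{\sqrt{3}} & \tfrac{1}{\sqrt{3}} \\
0 & \tfrac{2}{\sqrt{6}} & \tfrac{1}{\sqrt{6}} \\
0 & 0 & \tfrac{1}{\sqrt{2}}
\end{bmatrix}}_{R}
\]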
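A QR-decomposition built directly from the Gram-Schmidt process can be sketched as follows. This is a minimal illustration and not a recommended numerical method: the function name `qr_gram_schmidt` is hypothetical, matrices are plain lists of rows, the columns of A are assumed to be linearly independent, and ordinary floating point is used. The entries of R are the inner products that appear in Formula 15.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def qr_gram_schmidt(A):
    """Return (Q, R) with A = QR, Q having orthonormal columns, R upper triangular."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]   # columns u_1..u_n
    qs = []
    for u in cols:
        v = [float(x) for x in u]
        for q in qs:
            c = dot(u, q)                                    # <u, q>, with ||q|| = 1
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        qs.append([vi / norm for vi in v])
    # R[i][j] = <u_j, q_i> on and above the diagonal, 0 below it.
    R = [[dot(cols[j], qs[i]) if j >= i else 0.0 for j in range(n)] for i in range(n)]
    Q = [[qs[j][i] for j in range(n)] for i in range(m)]
    return Q, R

A = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
Q, R = qr_gram_schmidt(A)

# Multiply Q and R back together to compare with A.
QR = [[sum(Q[i][k] * R[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(R)
print(QR)
```

Multiplying the factors back together, as in the last lines, is a simple way to confirm a decomposition worked out by hand.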


Concept Review 


Orthogonal and orthonormal sets 
Normalizing a vector 
Orthogonal projections 
Gram-Schmidt process 


QR-decomposition 


Skills 


* Determine whether a set of vectors is orthogonal (or orthonormal). 
* Compute the coordinates of a vector with respect to an orthogonal (or orthonormal) basis. 
* Find the orthogonal projection of a vector onto a subspace. 


* Use the Gram-Schmidt process to construct an orthogonal (or orthonormal) basis for an inner product 
space. 


* Find the QR-decomposition of an invertible matrix.


Exercise Set 6.3 


1. Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on $R^2$?


(a) (0,1), (2,0) 

ef 1 їз. 1 
{2 y2; W2 yo 

(c) Lx - 

(d) (0, 0), (0, 1)


Answer: 


(a), (b), (d) 


2. Which of the sets in Exercise 1 are orthonormal with respect to the Euclidean inner product on $R^2$?


3. Which of the following sets of vectors are orthogonal with respect to the Euclidean inner product on $R^3$?
BUDE Wu ккк ee eed oe 
4s SS үзүү үз {2 
[2 _21\{21_2\122 
3' 1р5 23Lr14 3 3 


(c) 1 1 
1, 0, 0), |0, ў ‚ (0,0,1 
(1, › | d ( ) 


От шал 8 
Ve ye PEW yo 

Answer: 

(Ы), (d) 


4. Which of the sets in Exercise 3 are orthonormal with respect to the Euclidean inner product on $R^3$?


5. Which of the following sets of polynomials are orthonormal with respect to the inner product on $P_2$ discussed in Example 7 of Section 6.1?


a 4 A23 1.1 m EB mm à p 2,2 
(а) рух) 3 gt + 3x, pi) $03 Sx", Рз(х) +З ox 


©) piG) = 1, pale) = ея + ара 


Answer: 


(a) 
6. Which of the following sets of matrices are orthonormal with respect to the inner product on $M_{22}$ discussed in Example 6 of Section 6.1?


(a) 2 2 1 
1 0 p 3 j 3 9 3 
op Hun [ox we а 9 

3 3 $5 3 3 


ERE HE 


7. Verify that the given vectors form an orthogonal set with respect to the Euclidean inner product; then convert it to an orthonormal set by normalizing the vectors.


(a) (— 1, 2), (6, 3) 
(b) (1,0, = 1), (2, 0, 2), (0, 5, 0) 


© (L2 (-ii90i-i) 


l o, L.| (0,1,0 
Jr и 


(с) al bolo wp opu c +2. 
ERB {2 Ja у We Ve y6 

8. Verify that the set of vectors {(1, 0), (0, 1)} is orthogonal with respect to the inner product $\langle \mathbf{u}, \mathbf{v}\rangle = 4u_1v_1 + u_2v_2$ on $R^2$; then convert it to an orthonormal set by normalizing the vectors.
9. Verify that the vectors

278 —{4 3 = 
“=| $ $0) v2= ($. 2.0) v3= @, 0,1) 

form an orthonormal basis for $R^3$ with the Euclidean inner product; then use Theorem 6.3.2b to express each of the following as linear combinations of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$.

(a) (1, —1, 2) 

(b) (3, — 7,4) 


o (4) 


Answer: 


(а) = -+ iv =+ 2v3 


® — iv – v+ 4v3 


(с) =y el s 
vi 72 1 3Y3 


10. Verify that the vectors
\[ \mathbf{v}_1 = (1, -1, 2, -1), \quad \mathbf{v}_2 = (-2, 2, 3, 2), \quad \mathbf{v}_3 = (1, 2, 0, -1), \quad \mathbf{v}_4 = (1, 0, 0, 1) \]
form an orthogonal basis for $R^4$ with the Euclidean inner product; then use Theorem 6.3.2a to express each of the following as linear combinations of $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$, and $\mathbf{v}_4$.


60-051, 1, 1) 


(b) (V2. — 372, 5/2, - y2) 


Gr [—4 2-14 
3 3 24 


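11. (a) Show that the vectors

\[ \mathbf{v}_1 = (1, -2, 3, -4), \quad \mathbf{v}_2 = (2, 1, -4, -3), \quad \mathbf{v}_3 = (-3, 4, 1, -2), \quad \mathbf{v}_4 = (4, 3, 2, 1) \]

form an orthogonal basis for $R^4$ with the Euclidean inner product.

(b) Use Theorem 6.3.2a to express u = (-1, 2, 3, 7) as a linear combination of the vectors in part (a).

Answer:

(b) $\mathbf{u} = -\tfrac{4}{5}\mathbf{v}_1 - \tfrac{11}{10}\mathbf{v}_2 + 0\mathbf{v}_3 + \tfrac{1}{2}\mathbf{v}_4$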


In Exercises 12-13, an orthonormal basis with respect to the Euclidean inner product is given. Use Theorem 6.3.2b 
to find the coordinate vector of w with respect to that basis. 


12. 
© 6 (3,7); uj = " +5} u= > | 


(b) w-(-102:u-(2. - 2. 1)u i-i 
13. (a) ay 212 шр ee = CA eee 
Моа (s. 3 2) m- (2,2, з= (2. | J 
zi 


3 

(b) 3 1 1 1 2 

«Chua [Rr w= |-—, - = 
Ay) 


(b) w=- + tha, 


V6 уб 


In Exercises 14-15, the given vectors are orthogonal with respect to the Euclidean inner product. Find $\operatorname{proj}_W \mathbf{x}$, where x = (1, 2, 0, -2) and W is the subspace of $R^4$ spanned by the vectors.


14. (a) $\mathbf{v}_1 = (1, 1, 1, 1)$, $\mathbf{v}_2 = (1, 1, -1, -1)$
(b) $\mathbf{v}_1 = (0, 1, -4, -1)$, $\mathbf{v}_2 = (3, 5, 1, 1)$


15. (a) $\mathbf{v}_1 = (1, 1, 1, 1)$, $\mathbf{v}_2 = (1, 1, -1, -1)$, $\mathbf{v}_3 = (1, -1, 1, -1)$
(b) $\mathbf{v}_1 = (0, 1, -4, -1)$, $\mathbf{v}_2 = (3, 5, 1, 1)$, $\mathbf{v}_3 = (1, 0, 1, -4)$


Answer: 


^ 4.-$-1 


(b) Es 7 1 A 


Boa а 715 


In Exercises 16-17, the given vectors are orthonormal with respect to the Euclidean inner product. Use Theorem 6.3.4b to find $\operatorname{proj}_W \mathbf{x}$, where x = (1, 2, 0, -1) and W is the subspace of $R^4$ spanned by the vectors.


18. In Example 6 of Section 4.9 we found the orthogonal projection of the vector x = (1, 5) onto the line through 
the origin making an angle of $\pi/6$ radians with the x-axis. Solve that same problem using Theorem 6.3.4.


19. Find the vectors $\mathbf{w}_1$ in W and $\mathbf{w}_2$ in $W^{\perp}$ such that $\mathbf{x} = \mathbf{w}_1 + \mathbf{w}_2$, where x and W are as given in
(a) Exercise 14(a). 
(b) Exercise 15(a). 


Answer: 


Ome “ie cibi - 

Wy [25 1 1 |, we 5 z> l 1 

b) a232 3 3 >f 2 33 3 
W| pe 4° 4 > W2 4'4'4 4 


20. Find the vectors $\mathbf{w}_1$ in W and $\mathbf{w}_2$ in $W^{\perp}$ such that $\mathbf{x} = \mathbf{w}_1 + \mathbf{w}_2$, where x and W are as given in
(a) Exercise 16(a).
(b) Exercise 17(a).
21. Let $R^2$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$ into an orthonormal basis. Draw both sets of basis vectors in the xy-plane.
(a) $\mathbf{u}_1 = (1, -3)$, $\mathbf{u}_2 = (2, 2)$
(b) $\mathbf{u}_1 = (1, 0)$, $\mathbf{u}_2 = (3, -5)$


Answer: 


— 
e 


ic) 


(b) $\mathbf{v}_1 = (1, 0)$, $\mathbf{v}_2 = (0, -1)$


22. Let $R^3$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ into an orthonormal basis.


(a) $\mathbf{u}_1 = (1, 1, 1)$, $\mathbf{u}_2 = (-1, 1, 0)$, $\mathbf{u}_3 = (1, 2, 1)$
(b) $\mathbf{u}_1 = (1, 0, 0)$, $\mathbf{u}_2 = (3, 7, -2)$, $\mathbf{u}_3 = (0, 4, 1)$


23. Let $R^4$ have the Euclidean inner product. Use the Gram-Schmidt process to transform the basis $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\}$ into an orthonormal basis.

\[ \mathbf{u}_1 = (0, 2, 1, 0), \quad \mathbf{u}_2 = (1, -1, 0, 0), \quad \mathbf{u}_3 = (1, 2, 0, -1), \quad \mathbf{u}_4 = (1, 0, 0, 1) \]


Answer: 


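24. Let $R^3$ have the Euclidean inner product. Find an orthonormal basis for the subspace spanned by (0, 1, 2), (-1, 0, 1), (-1, 1, 3).
25. Let $R^3$ have the inner product
\[ \langle \mathbf{u}, \mathbf{v}\rangle = u_1v_1 + 2u_2v_2 + 3u_3v_3 \]
Use the Gram-Schmidt process to transform $\mathbf{u}_1 = (1, 1, 1)$, $\mathbf{u}_2 = (1, 1, 0)$, $\mathbf{u}_3 = (1, 0, 0)$ into an orthonormal basis.


Answer:


\[ \mathbf{v}_1 = \left(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}\right), \quad \mathbf{v}_2 = \left(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}\right), \quad \mathbf{v}_3 = \left(\tfrac{2}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}, 0\right) \]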
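26. Let $R^3$ have the Euclidean inner product. The subspace of $R^3$ spanned by the vectors $\mathbf{u}_1 = \left(\tfrac{4}{5}, 0, -\tfrac{3}{5}\right)$ and $\mathbf{u}_2 = (0, 1, 0)$ is a plane passing through the origin. Express w = (1, 2, 3) in the form $\mathbf{w} = \mathbf{w}_1 + \mathbf{w}_2$, where $\mathbf{w}_1$ lies in the plane and $\mathbf{w}_2$ is perpendicular to the plane.


27. Repeat Exercise 26 with $\mathbf{u}_1 = (1, 1, 1)$ and $\mathbf{u}_2 = (2, 0, -1)$.
Answer:


\[ \mathbf{w}_1 = \left(\tfrac{13}{14}, \tfrac{31}{14}, \tfrac{40}{14}\right), \quad \mathbf{w}_2 = \left(\tfrac{1}{14}, -\tfrac{3}{14}, \tfrac{2}{14}\right) \]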


28. Let $R^4$ have the Euclidean inner product. Express the vector w = (-1, 2, 6, 0) in the form $\mathbf{w} = \mathbf{w}_1 + \mathbf{w}_2$, where $\mathbf{w}_1$ is in the space W spanned by $\mathbf{u}_1 = (-1, 0, 1, 2)$ and $\mathbf{u}_2 = (0, 1, 0, 1)$, and $\mathbf{w}_2$ is orthogonal to W.


29. Find the QR-decomposition of the matrix, where possible.


(а) |1 -1 
3 


т=н qu рта: edi 


Answer: 


is а -e 
Ia de SEG Qu Ies © 


с Eu e (ст o o 
7—1 са ү. Ie o o к _—_— 
ee ш Rp ее 


5 ^ e «е. T 


METRE 


“© ро = 4s 4s 484848 -©-е-е © sis ae 


{ече ode е ч iam dee de 42 S = 


а а) ee к... _——————- 


(f) Columns not linearly independent 


30. In Step 3 of the proof of Theorem 6.3.5, it was stated that “the linear independence of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ ensures that $\mathbf{v}_3 \neq \mathbf{0}$.” Prove this statement.


31. Prove that the diagonal entries of R in Formula 15 are nonzero.


32. Calculus required Use Theorem 6.3.2a to express the following polynomials as linear combinations of the first three Legendre polynomials (see the Remark following Example 8).


(a) $1 + x + 4x^2$
(b) $2 - 7x^2$
(c) $4 + 3x$


33. Calculus required Let $P_2$ have the inner product

\[ \langle p, q\rangle = \int_0^1 p(x)q(x)\,dx \]

Apply the Gram-Schmidt process to transform the standard basis $S = \{1, x, x^2\}$ into an orthonormal basis.


Answer:


\[ \mathbf{v}_1 = 1, \quad \mathbf{v}_2 = \sqrt{3}(2x - 1), \quad \mathbf{v}_3 = \sqrt{5}(6x^2 - 6x + 1) \]

34. Find vectors x and y in $R^2$ that are orthonormal with respect to the inner product $\langle \mathbf{u}, \mathbf{v}\rangle = 3u_1v_1 + 2u_2v_2$ but are not orthonormal with respect to the Euclidean inner product.


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) Every linearly independent set of vectors in an inner product space is orthogonal. 


Answer: 


False 


(b) Every orthogonal set of vectors in an inner product space is linearly independent. 


Answer: 


False 


(c) Every nontrivial subspace of $R^3$ has an orthonormal basis with respect to the Euclidean inner product.


Answer: 


True 


(d) Every nonzero finite-dimensional inner product space has an orthonormal basis. 


Answer: 


True 


(e) $\operatorname{proj}_W \mathbf{x}$ is orthogonal to every vector of W.


Answer: 


False 


(f) If A is an n × n matrix with a nonzero determinant, then A has a QR-decomposition.
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.


6.4 Best Approximation; Least Squares 


In this section we will be concerned with linear systems that cannot be solved exactly and for which an approximate solution is 
needed. Such systems commonly occur in applications where measurement errors “perturb” the coefficients of a consistent system 
sufficiently to produce inconsistency. 


Least Squares Solutions of Linear Systems 


Suppose that Ax = b is an inconsistent linear system of m equations in n unknowns in which we suspect the inconsistency to be caused by measurement errors in the coefficients of A. Since no exact solution is possible, we will look for a vector x that comes as “close as possible” to being a solution in the sense that it minimizes $\|\mathbf{b} - A\mathbf{x}\|$ with respect to the Euclidean inner product on $R^m$. You can think of Ax as an approximation to b and $\|\mathbf{b} - A\mathbf{x}\|$ as the error in that approximation: the smaller the error, the better the approximation. This leads to the following problem.


Least Squares Problem 


Given a linear system Ax = b of m equations in n unknowns, find a vector x that minimizes $\|\mathbf{b} - A\mathbf{x}\|$ with respect to the Euclidean inner product on $R^m$. We call such an x a least squares solution of the system, we call $\mathbf{b} - A\mathbf{x}$ the least squares error vector, and we call $\|\mathbf{b} - A\mathbf{x}\|$ the least squares error.


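To clarify the above terminology, suppose that the matrix form of $\mathbf{b} - A\mathbf{x}$ is

\[ \mathbf{b} - A\mathbf{x} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{bmatrix} \]

The term “least squares solution” results from the fact that minimizing $\|\mathbf{b} - A\mathbf{x}\|$ also minimizes $\|\mathbf{b} - A\mathbf{x}\|^2 = e_1^2 + e_2^2 + \cdots + e_m^2$.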


Best Approximation 


Suppose that b is a fixed vector in $R^3$ that we would like to approximate by a vector w that is required to lie in some subspace W of $R^3$. Unless b happens to be in W, any such approximation will result in an “error vector” $\mathbf{b} - \mathbf{w}$ that cannot be made equal to 0 no matter how w is chosen (Figure 6.4.1a). However, by choosing

\[ \mathbf{w} = \operatorname{proj}_W \mathbf{b} \]

we can make the length of the error vector

\[ \|\mathbf{b} - \mathbf{w}\| = \|\mathbf{b} - \operatorname{proj}_W \mathbf{b}\| \]

as small as possible (Figure 6.4.1b).

Figure 6.4.1


These geometric ideas suggest the following general theorem. 


THEOREM 6.4.1 Best Approximation Theorem 


If W is a finite-dimensional subspace of an inner product space V, and if b is a vector in V, then $\operatorname{proj}_W \mathbf{b}$ is the best approximation to b from W in the sense that

\[ \|\mathbf{b} - \operatorname{proj}_W \mathbf{b}\| < \|\mathbf{b} - \mathbf{w}\| \]

for every vector w in W that is different from $\operatorname{proj}_W \mathbf{b}$.


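Proof For every vector w in W, we can write

\[ \mathbf{b} - \mathbf{w} = (\mathbf{b} - \operatorname{proj}_W \mathbf{b}) + (\operatorname{proj}_W \mathbf{b} - \mathbf{w}) \tag{1} \]

But $\operatorname{proj}_W \mathbf{b} - \mathbf{w}$, being a difference of vectors in W, is itself in W; and since $\mathbf{b} - \operatorname{proj}_W \mathbf{b}$ is orthogonal to W, the two terms on the right side of 1 are orthogonal. Thus, it follows from the Theorem of Pythagoras (Theorem 6.2.3) that

\[ \|\mathbf{b} - \mathbf{w}\|^2 = \|\mathbf{b} - \operatorname{proj}_W \mathbf{b}\|^2 + \|\operatorname{proj}_W \mathbf{b} - \mathbf{w}\|^2 \]

Since $\mathbf{w} \neq \operatorname{proj}_W \mathbf{b}$, it follows that the second term in this sum is positive, and hence that

\[ \|\mathbf{b} - \operatorname{proj}_W \mathbf{b}\|^2 < \|\mathbf{b} - \mathbf{w}\|^2 \]

Since norms are nonnegative, it follows (from a property of inequalities) that

\[ \|\mathbf{b} - \operatorname{proj}_W \mathbf{b}\| < \|\mathbf{b} - \mathbf{w}\| \]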


Least Squares Solutions of Linear Systems 


One way to find a least squares solution of Ax = b is to calculate the orthogonal projection $\operatorname{proj}_W \mathbf{b}$ on the column space W of the matrix A and then solve the equation

\[ A\mathbf{x} = \operatorname{proj}_W \mathbf{b} \tag{2} \]

However, we can avoid the need to calculate the projection by rewriting 2 as

\[ \mathbf{b} - A\mathbf{x} = \mathbf{b} - \operatorname{proj}_W \mathbf{b} \]

and then multiplying both sides of this equation by $A^T$ to obtain

\[ A^T(\mathbf{b} - A\mathbf{x}) = A^T(\mathbf{b} - \operatorname{proj}_W \mathbf{b}) \tag{3} \]

Since $\mathbf{b} - \operatorname{proj}_W \mathbf{b}$ is the component of b that is orthogonal to the column space of A, it follows from Theorem 4.8.9b that this vector lies in the null space of $A^T$, and hence that

\[ A^T(\mathbf{b} - \operatorname{proj}_W \mathbf{b}) = \mathbf{0} \]

Thus, 3 simplifies to

\[ A^T(\mathbf{b} - A\mathbf{x}) = \mathbf{0} \]

which we can rewrite as

\[ A^TA\mathbf{x} = A^T\mathbf{b} \tag{4} \]

This is called the normal equation or the normal system associated with Ax = b. When viewed as a linear system, the individual equations are called the normal equations associated with Ax = b.


In summary, we have established the following result. 


THEOREM 6.4.2 
For every linear system Ax = b, the associated normal system

\[ A^TA\mathbf{x} = A^T\mathbf{b} \tag{5} \]

is consistent, and all solutions of 5 are least squares solutions of Ax = b. Moreover, if W is the column space of A, and x is any least squares solution of Ax = b, then the orthogonal projection of b on W is

\[ \operatorname{proj}_W \mathbf{b} = A\mathbf{x} \tag{6} \]


If a linear system is consistent, then its exact solutions are 
the same as its least squares solutions, in which case the 
error is zero.


EXAMPLE 1 Least Squares Solution <4 


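(a) Find all least squares solutions of the linear system

\[
\begin{aligned}
x_1 - x_2 &= 4 \\
3x_1 + 2x_2 &= 1 \\
-2x_1 + 4x_2 &= 3
\end{aligned}
\]

(b) Find the error vector and the error.


Solution


(a) It will be convenient to express the system in the matrix form Ax = b, where

\[ A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} \]

It follows that

\[ A^TA = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} = \begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix}, \qquad A^T\mathbf{b} = \begin{bmatrix} 1 & 3 & -2 \\ -1 & 2 & 4 \end{bmatrix} \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix} \]

so the normal system $A^TA\mathbf{x} = A^T\mathbf{b}$ is

\[ \begin{bmatrix} 14 & -3 \\ -3 & 21 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \end{bmatrix} \]

Solving this system yields a unique least squares solution, namely,

\[ x_1 = \tfrac{17}{95}, \quad x_2 = \tfrac{143}{285} \]

(b) The error vector is

\[ \mathbf{b} - A\mathbf{x} = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 & -1 \\ 3 & 2 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} \tfrac{17}{95} \\ \tfrac{143}{285} \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix} - \begin{bmatrix} -\tfrac{92}{285} \\ \tfrac{439}{285} \\ \tfrac{94}{57} \end{bmatrix} = \begin{bmatrix} \tfrac{1232}{285} \\ -\tfrac{154}{285} \\ \tfrac{77}{57} \end{bmatrix} \]

and the error is

\[ \|\mathbf{b} - A\mathbf{x}\| = \frac{77}{\sqrt{285}} \approx 4.561 \]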
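The normal-equation approach can be sketched in code as follows. This is a hedged illustration rather than a general-purpose solver: the helper names `transpose`, `solve`, and `least_squares` are hypothetical, matrices are plain lists of rows, exact fractions are used, and the columns of A are assumed to be linearly independent so that the normal system has a unique solution. The data are the three equations x1 - x2 = 4, 3x1 + 2x2 = 1, -2x1 + 4x2 = 3.

```python
from fractions import Fraction
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def solve(M, rhs):
    """Solve the square system M x = rhs by Gauss-Jordan elimination (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [x / aug[c][c] for x in aug[c]]          # make the pivot 1
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[-1] for row in aug]

def least_squares(A, b):
    """Least squares solution of A x = b from the normal system A^T A x = A^T b."""
    At = transpose(A)
    AtA = [[sum(p * q for p, q in zip(r, c)) for c in At] for r in At]
    Atb = [sum(p * q for p, q in zip(r, b)) for r in At]
    return solve(AtA, Atb)

A = [[1, -1], [3, 2], [-2, 4]]
b = [4, 1, 3]

x = least_squares(A, b)
error_vector = [bi - sum(a * xi for a, xi in zip(row, x)) for row, bi in zip(A, b)]
error = math.sqrt(sum(e * e for e in error_vector))
print(x, error_vector, error)
```

With these inputs the error vector has squared length 77^2/285, so the printed error is 77/sqrt(285), roughly 4.561.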


EXAMPLE 2 Orthogonal Projection on a Subspace <4 


Find the orthogonal projection of the vector u = ( — 3, — 3, 8, 9) on the subspace of 24 spanned by the vectors 
uy = (3,1,0, 1), ш= (1,2,1, 1), 03=(- 1,0, 2, —1) 


Solution We could solve this problem by first using the Gram-Schmidt process to convert (uj, из, u3} into an 
orthonormal basis and then applying the method used in Example 6 of Section 6.3 . However, the following method 
is more efficient. 


The subspace W of RÍ spanned by щу, u5, and U3 is the column space of the matrix 


3 d =l 
12 0 
= |00 2 
1 —1 


Thus, if u is expressed as a column vector, we can find the orthogonal projection of u on W by finding a least 
squares solution of the system Ах = u and then calculating ргој u = Ax from the least squares solution. The 
computations are as follows: The system 4x = u is 


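$$\begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -3 \\ -3 \\ 8 \\ 9 \end{bmatrix}$$

so

$$A^TA = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix}\begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 11 & 6 & -4 \\ 6 & 7 & 0 \\ -4 & 0 & 6 \end{bmatrix}$$

$$A^T\mathbf{u} = \begin{bmatrix} 3 & 1 & 0 & 1 \\ 1 & 2 & 1 & 1 \\ -1 & 0 & 2 & -1 \end{bmatrix}\begin{bmatrix} -3 \\ -3 \\ 8 \\ 9 \end{bmatrix} = \begin{bmatrix} -3 \\ 8 \\ 10 \end{bmatrix}$$

The normal system $A^TA\mathbf{x} = A^T\mathbf{u}$ in this case is

$$\begin{bmatrix} 11 & 6 & -4 \\ 6 & 7 & 0 \\ -4 & 0 & 6 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -3 \\ 8 \\ 10 \end{bmatrix}$$


Solving this system yields

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}$$

as the least squares solution of Ax = u (verify), so


$$\operatorname{proj}_W \mathbf{u} = A\mathbf{x} = \begin{bmatrix} 3 & 1 & -1 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix}\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ 3 \\ 4 \\ 0 \end{bmatrix}$$


or, in comma-delimited notation, $\operatorname{proj}_W \mathbf{u} = (-2, 3, 4, 0)$.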
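The method of Example 2 (solve the normal system, then multiply by A) can be sketched as follows. The function names `solve` and `project_onto_column_space` are hypothetical, and the sketch assumes the columns of A are linearly independent so that the normal system has a unique solution.

```python
from fractions import Fraction

def solve(G, rhs):
    """Solve G x = rhs by Gauss-Jordan elimination (G is assumed invertible)."""
    n = len(G)
    aug = [[Fraction(v) for v in G[i]] + [Fraction(rhs[i])] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def project_onto_column_space(A, u):
    """Return (x, proj) where x solves A^T A x = A^T u and proj = A x."""
    cols = list(zip(*A))
    AtA = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    Atu = [sum(p * q for p, q in zip(ci, u)) for ci in cols]
    x = solve(AtA, Atu)
    proj = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return x, proj

# Columns of A are u1, u2, u3 from Example 2.
A = [[3, 1, -1],
     [1, 2, 0],
     [0, 1, 2],
     [1, 1, -1]]
u = [-3, -3, 8, 9]
x, proj = project_onto_column_space(A, u)
```

Exact rational arithmetic is used so that the result can be compared directly with the hand computation; with floating-point input the same steps apply but comparisons need a tolerance.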


Uniqueness of Least Squares Solutions 


In general, least squares solutions of linear systems are not unique. Although the linear system in Example 1 turned out to have a 
unique least squares solution, that occurred only because the coefficient matrix of the system happened to satisfy certain conditions 
that guarantee uniqueness. Our next theorem will show what those conditions are. 


THEOREM 6.4.3 


If A is an m × n matrix, then the following are equivalent.
(a) A has linearly independent column vectors.


(b) $A^TA$ is invertible.


Proof We will prove that (a) ⇒ (b) and leave the proof that (b) ⇒ (a) as an exercise.


(a) ⇒ (b) Assume that A has linearly independent column vectors. The matrix $A^TA$ has size n × n, so we can prove that this
matrix is invertible by showing that the linear system $A^TA\mathbf{x} = \mathbf{0}$ has only the trivial solution. But if x is any solution of this
system, then Ax is in the null space of $A^T$ and also in the column space of A. By Theorem 4.8.9(b) these spaces are orthogonal
complements, so part (b) of Theorem 6.2.4 implies that Ax = 0. But A is assumed to have linearly independent column vectors, so
x = 0 by Theorem 1.3.1.


As an exercise, try using Formula 7 to solve the problem 
in part (a) of Example 1. 


The next theorem, which follows directly from Theorem 6.4.2 and Theorem 6.4.3, gives an explicit formula for the least squares 
solution of a linear system in which the coefficient matrix has linearly independent column vectors. 


THEOREM 6.4.4 


If A is an m × n matrix with linearly independent column vectors, then for every m × 1 matrix b, the linear system Ax = b
has a unique least squares solution. This solution is given by


$$\mathbf{x} = (A^TA)^{-1}A^T\mathbf{b} \qquad (7)$$


Moreover, if W is the column space of A, then the orthogonal projection of b on W is


$$\operatorname{proj}_W \mathbf{b} = A\mathbf{x} = A(A^TA)^{-1}A^T\mathbf{b} \qquad (8)$$


OPTIONAL 
The Role of QR-Decomposition in Least Squares Problems 


Formulas 7 and 8 have theoretical use but are not well suited for numerical computation. In practice, least squares solutions of
Ax = b are typically found by using some variation of Gaussian elimination to solve the normal equations or by using
QR-decomposition and the following theorem.


THEOREM 6.4.5 


If A is an m × n matrix with linearly independent column vectors, and if A = QR is a QR-decomposition of A (see Theorem
6.3.7), then for each b in $R^m$ the system Ax = b has a unique least squares solution given by


$$\mathbf{x} = R^{-1}Q^T\mathbf{b} \qquad (9)$$


A proof of this theorem and a discussion of its use can be found in many books on numerical methods of linear algebra. However,
you can obtain Formula 9 by making the substitution A = QR in 7 and using the fact that $Q^TQ = I$ to obtain


$$\begin{aligned} \mathbf{x} &= \left((QR)^T(QR)\right)^{-1}(QR)^T\mathbf{b} \\ &= (R^TQ^TQR)^{-1}(QR)^T\mathbf{b} \\ &= R^{-1}(R^T)^{-1}R^TQ^T\mathbf{b} \\ &= R^{-1}Q^T\mathbf{b} \end{aligned}$$
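Formula 9 can be illustrated on the system of Example 1. The sketch below builds a QR-decomposition by the classical Gram-Schmidt process (chosen here for readability; numerical libraries generally prefer more stable constructions) and then solves $R\mathbf{x} = Q^T\mathbf{b}$ by back substitution rather than forming $R^{-1}$ explicitly. The function names are illustrative, and the columns of A are assumed to be linearly independent.

```python
from math import sqrt

def qr_gram_schmidt(A):
    """Return (Q_cols, R): orthonormal columns of Q and upper triangular R with A = QR."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    Q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        v = list(a)
        for i, q in enumerate(Q_cols):
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))   # <q_i, a_j>
            v = [vk - R[i][j] * qk for vk, qk in zip(v, q)]
        R[j][j] = sqrt(sum(vk * vk for vk in v))             # nonzero if columns independent
        Q_cols.append([vk / R[j][j] for vk in v])
    return Q_cols, R

def solve_upper(R, c):
    """Back substitution for an upper triangular system R x = c."""
    n = len(R)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (c[i] - s) / R[i][i]
    return x

A = [[1.0, -1.0], [3.0, 2.0], [-2.0, 4.0]]
b = [4.0, 1.0, 3.0]
Q_cols, R = qr_gram_schmidt(A)
Qtb = [sum(qk * bk for qk, bk in zip(q, b)) for q in Q_cols]   # Q^T b
x = solve_upper(R, Qtb)                                         # x = R^{-1} Q^T b
```

The result agrees, up to rounding, with the solution $x_1 = 17/95$, $x_2 = 143/285$ obtained from the normal equations.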


Orthogonal Projections on Subspaces of R” 


In Section 4.8 we showed how to compute orthogonal projections on the coordinate axes of a rectangular coordinate system in $R^3$
and more generally on lines through the origin of $R^3$. We will now consider the problem of finding orthogonal projections on
subspaces of $R^m$. We begin with the following definition.


DEFINITION 1


If W is a subspace of $R^m$, then the linear transformation $P\colon R^m \to W$ that maps each vector x in $R^m$ into its orthogonal
projection $\operatorname{proj}_W \mathbf{x}$ in W is called the orthogonal projection of $R^m$ on W.


It follows from Formula 7 that the standard matrix for the transformation P is 


$$[P] = A(A^TA)^{-1}A^T \qquad (10)$$

where A is constructed using any basis for W as its column vectors.


EXAMPLE 3 The Standard Matrix for an Orthogonal Projection on a Line


We showed in Formula 16 of Section 4.9 that

$$[P_\theta] = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix}$$


is the standard matrix for the orthogonal projection on the line W through the origin of $R^2$ that makes an angle θ with
the positive x-axis. Derive this result using Formula 10.


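Solution The column vectors of A can be formed from any basis for W. Since W is one-dimensional, we can take
w = (cos θ, sin θ) as the basis vector (Figure 6.4.2), so


$$A = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$$

We leave it for you to show that $A^TA$ is the 1 × 1 identity matrix. Thus, Formula 10 simplifies to


$$[P] = A(A^TA)^{-1}A^T = AA^T = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}\begin{bmatrix} \cos\theta & \sin\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix} = [P_\theta]$$


Figure 6.4.2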
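The matrix derived in Example 3 can be checked numerically. The following sketch, with illustrative function names and an arbitrarily chosen angle of π/6, builds $AA^T$ for the unit vector w = (cos θ, sin θ) and tests three properties one expects of an orthogonal projection on a line: vectors on the line are fixed, vectors perpendicular to the line are sent to 0, and applying the projection twice is the same as applying it once.

```python
from math import cos, sin, pi

def projection_matrix(theta):
    """Standard matrix A A^T for projection on the line spanned by (cos theta, sin theta)."""
    w = (cos(theta), sin(theta))
    return [[w[i] * w[j] for j in range(2)] for i in range(2)]

def apply(P, v):
    """Multiply the 2 x 2 matrix P by the vector v."""
    return [sum(P[i][j] * v[j] for j in range(2)) for i in range(2)]

theta = pi / 6
P = projection_matrix(theta)

on_line = apply(P, [cos(theta), sin(theta)])       # vector on W: should be unchanged
off_line = apply(P, [-sin(theta), cos(theta)])     # vector perpendicular to W: should map to 0
P_squared = [[sum(P[i][k] * P[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]                    # should equal P
```

Because floating-point arithmetic is used, these identities hold only up to rounding error.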


Another View of Least Squares 


Recall from Theorem 4.8.9 that the null space and row space of an m × n matrix A are orthogonal complements, as are the null
space of $A^T$ and the column space of A. Thus, given a linear system Ax = b in which A is an m × n matrix, the Projection
Theorem (6.3.3) tells us that the vectors x and b can each be decomposed into sums of orthogonal terms as

$$\mathbf{x} = \mathbf{x}_{\text{row}(A)} + \mathbf{x}_{\text{null}(A)} \quad\text{and}\quad \mathbf{b} = \mathbf{b}_{\text{null}(A^T)} + \mathbf{b}_{\text{col}(A)}$$


where $\mathbf{x}_{\text{row}(A)}$ and $\mathbf{x}_{\text{null}(A)}$ are the orthogonal projections of x on the row space of A and the null space of A, and the vectors
$\mathbf{b}_{\text{null}(A^T)}$ and $\mathbf{b}_{\text{col}(A)}$ are the orthogonal projections of b on the null space of $A^T$ and the column space of A.


In Figure 6.4.3 we have represented the fundamental spaces of A by perpendicular lines in $R^n$ and $R^m$ on which we indicated the
orthogonal projections of x and b. (This, of course, is only pictorial since the fundamental spaces need not be one-dimensional.)
The figure shows Ax as a point in the column space of A and conveys that $\mathbf{b}_{\text{col}(A)}$ is the point in col(A) that is closest to b. This
illustrates that the least squares solutions of Ax = b are the exact solutions of the equation $A\mathbf{x} = \mathbf{b}_{\text{col}(A)}$.


Figure 6.4.3


More on the Equivalence Theorem 


As our final result in the main part of this section we will add one additional part to Theorem 5.1.6. 


THEOREM 6.4.6 Equivalent Statements 


If A is an n × n matrix, then the following statements are equivalent.
(a) A is invertible.

(b) Ax = 0 has only the trivial solution.

(c) The reduced row echelon form of A is $I_n$.

(d) A is expressible as a product of elementary matrices.

(e) Ax = b is consistent for every n × 1 matrix b.

(f) Ax = b has exactly one solution for every n × 1 matrix b.

(g) det(A) ≠ 0.

(h) The column vectors of A are linearly independent.

(i) The row vectors of A are linearly independent.

(j) The column vectors of A span $R^n$.

(k) The row vectors of A span $R^n$.

(l) The column vectors of A form a basis for $R^n$.

(m) The row vectors of A form a basis for $R^n$.

(n) A has rank n.

(o) A has nullity 0.

(p) The orthogonal complement of the null space of A is $R^n$.

(q) The orthogonal complement of the row space of A is {0}.

(r) The range of $T_A$ is $R^n$.

(s) $T_A$ is one-to-one.

(t) λ = 0 is not an eigenvalue of A.

(u) $A^TA$ is invertible.


The proof of part (u) follows from part (h) of this theorem and Theorem 6.4.3 applied to square matrices.


OPTIONAL 


We now have all the ingredients needed to prove Theorem 6.3.3 in the special case where V is the vector space $R^n$.


Proof of Theorem 6.3.3 We will leave the case where W = {0} as an exercise, so assume that W ≠ {0}. Let
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ be any basis for W, and form the n × k matrix M that has these basis vectors as successive columns. This makes
W the column space of M and hence $W^\perp$ the null space of $M^T$. We will complete the proof by showing that every vector u in $R^n$
can be written in exactly one way as


$$\mathbf{u} = \mathbf{w}_1 + \mathbf{w}_2$$

where $\mathbf{w}_1$ is in the column space of M and $M^T\mathbf{w}_2 = \mathbf{0}$. However, to say that $\mathbf{w}_1$ is in the column space of M is equivalent to saying
$\mathbf{w}_1 = M\mathbf{x}$ for some vector x in $R^k$, and to say that $M^T\mathbf{w}_2 = \mathbf{0}$ is equivalent to saying that $M^T(\mathbf{u} - \mathbf{w}_1) = \mathbf{0}$. Thus, if we can
show that the equation


$$M^T(\mathbf{u} - M\mathbf{x}) = \mathbf{0} \qquad (11)$$


has a unique solution for x, then $\mathbf{w}_1 = M\mathbf{x}$ and $\mathbf{w}_2 = \mathbf{u} - \mathbf{w}_1$ will be uniquely determined vectors with the required properties. To
do this, let us rewrite 11 as


$$M^TM\mathbf{x} = M^T\mathbf{u}$$

Since the matrix M has linearly independent column vectors, the matrix $M^TM$ is invertible by Theorem 6.4.3 and hence the
equation has a unique solution as required to complete the proof.


Concept Review 

* Least squares problem 

* Least squares solution 

* Least squares error vector 
* Least squares error 

* Best approximation 

* Normal equation 


* Orthogonal projection 
Skills 


* Find the least squares solution of a linear system. 
* Find the error and error vector associated with a least squares solution to a linear system. 
* Use the techniques developed in this section to compute orthogonal projections. 


* Find the standard matrix of an orthogonal projection. 


Exercise Set 6.4 


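1. Find the normal system associated with the given linear system.


(a) $\begin{bmatrix} 1 & -1 \\ 2 & 3 \\ 4 & 5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}$


(b) $\begin{bmatrix} 2 & -1 & 0 \\ 3 & 1 & 2 \\ -1 & 4 & 5 \\ 1 & 2 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 1 \\ 2 \end{bmatrix}$

Answer:


(a) $\begin{bmatrix} 21 & 25 \\ 25 & 35 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 20 \\ 20 \end{bmatrix}$

(b) $\begin{bmatrix} 15 & -1 & 5 \\ -1 & 22 & 30 \\ 5 & 30 & 45 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 9 \\ 13 \end{bmatrix}$


In Exercises 2-4, find the least squares solution of the linear equation Ax = b.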


2. (а) 1 -1 
=|2 3jí;b-|-1 
4 5 
(b) 2 —2 
А= |1 15;b-2|-1 
3 
(b) 1 0 -1
2 1 -2 
s 11 0 
11 -1 
Answer: 
(a) x4—5, m= 
(b) $x_1 = 12$, $x_2 = -3$, $x_3 = 9$
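4. (a) $A = \begin{bmatrix} 3 & 2 & -1 \\ 1 & -4 & 3 \\ 1 & 10 & -7 \end{bmatrix}; \quad \mathbf{b} = \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & 0 & -1 \\ 1 & -2 & 2 \\ 2 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix}; \quad \mathbf{b} = \begin{bmatrix} 0 \\ 6 \\ 0 \\ 6 \end{bmatrix}$


In Exercises 5-6, find the least squares error vector e = b - Ax resulting from the least squares solution x and verify that it is
orthogonal to the column space of A.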


5. (а) A and b are as in Exercise 3(a). 
(b) А and b are as in Exercise 3(b). 


Answer: 


m 
ll 
Bolo polos 


I 
UJ © ш ш ы 


"m 
6. (a) A and b are as in Exercise 4(a).
(b) А and b are as in Exercise 4(b). 


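7. Find all least squares solutions of Ax = b and confirm that all of the solutions have the same error vector. Compute the least
squares error.


(a) $A = \begin{bmatrix} 2 & 1 \\ 4 & 2 \\ -2 & 1 \end{bmatrix}; \quad \mathbf{b} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 1 & 3 \\ -2 & -6 \\ 3 & 9 \end{bmatrix}; \quad \mathbf{b} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$

(c) $A = \begin{bmatrix} -1 & 3 & 2 \\ 2 & 1 & 3 \\ 0 & 1 & 1 \end{bmatrix}; \quad \mathbf{b} = \begin{bmatrix} 7 \\ 0 \\ -7 \end{bmatrix}$


Answer:


(a) Solution: $\mathbf{x} = \left(\frac{1}{10}, \frac{6}{5}\right)$; least squares error: $\frac{4}{5}\sqrt{5}$

(b) Solution: $\mathbf{x} = \left(\frac{2}{7}, 0\right) + t(-3, 1)$ (t a real number); least squares error: $\frac{1}{7}\sqrt{42}$

(c) Solution: $\mathbf{x} = \left(-\frac{7}{6}, \frac{7}{6}, 0\right) + t(-1, -1, 1)$ (t a real number); least squares error: $\frac{1}{2}\sqrt{294}$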


8. Find the orthogonal projection of u on the subspace of $R^3$ spanned by the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$.
(a) u = (2, 1, 3); $\mathbf{v}_1$ = (1, 1, 0), $\mathbf{v}_2$ = (1, 2, 1)
(b) u = (1, -6, 1); $\mathbf{v}_1$ = (-1, 2, 1), $\mathbf{v}_2$ = (2, 2, 4)


9. Find the orthogonal projection of u on the subspace of $R^4$ spanned by the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$.


(a) u = (6, 3, 9, 6); $\mathbf{v}_1$ = (2, 1, 1, 1), $\mathbf{v}_2$ = (1, 0, 1, 1), $\mathbf{v}_3$ = (-2, -1, 0, -1)
(b) u = (-2, 0, 2, 4); $\mathbf{v}_1$ = (1, 1, 3, 0), $\mathbf{v}_2$ = (-2, -1, -2, 1), $\mathbf{v}_3$ = (-3, -1, 1, 3)


Answer:

(a) (7, 2, 9, 5)

(b) $\left(-\frac{12}{5}, -\frac{4}{5}, \frac{12}{5}, \frac{16}{5}\right)$


10. Find the orthogonal projection of u = (5, 6, 7, 2) on the solution space of the homogeneous linear system

$$\begin{aligned} x_1 + x_2 + x_3 &= 0 \\ 2x_2 + x_3 + x_4 &= 0 \end{aligned}$$


11. In each part, find $\det(A^TA)$, and apply Theorem 6.4.3 to determine whether A has linearly independent column vectors.


(a) $A = \begin{bmatrix} -1 & 3 & 2 \\ 2 & 1 & 3 \\ 0 & 1 & 1 \end{bmatrix}$

(b) $A = \begin{bmatrix} 2 & -1 & 3 \\ 0 & 1 & 1 \\ -1 & 0 & -2 \\ 4 & -5 & 3 \end{bmatrix}$

Answer:


(a) $\det(A^TA) = 0$; A does not have linearly independent column vectors.


(b) $\det(A^TA) = 0$; A does not have linearly independent column vectors.


12. Use Formula 10 and the method of Example 3 to find the standard matrix for the orthogonal projection $P\colon R^2 \to R^2$ onto
(a) the x-axis.

(b) the y-axis.

[Note: Compare your results to Table 3 of Section 4.9.]

13. Use Formula 10 and the method of Example 3 to find the standard matrix for the orthogonal projection $P\colon R^3 \to R^3$ onto
(a) the xz-plane.

(b) the yz-plane.

[Note: Compare your results to Table 4 of Section 4.9.]


Answer:
(a) $[P] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

(b) $[P] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

14. Show that if w = (a, b, c) is a nonzero vector, then the standard matrix for the orthogonal projection of $R^3$ on the line
span{w} is

$$P = \frac{1}{a^2 + b^2 + c^2}\begin{bmatrix} a^2 & ab & ac \\ ab & b^2 & bc \\ ac & bc & c^2 \end{bmatrix}$$


15. Let W be the plane with equation 5x - 3y + z = 0.

(a) Find a basis for W.

(b) Use Formula 10 to find the standard matrix for the orthogonal projection on W.

(c) Use the matrix obtained in part (b) to find the orthogonal projection of a point $P_0(x_0, y_0, z_0)$ on W.

(d) Find the distance between the point $P_0(1, -2, 4)$ and the plane W, and check your result using Theorem 3.3.4.


Answer:


(a) (1, 0, -5), (0, 1, 3)

(b) $[P] = \frac{1}{35}\begin{bmatrix} 10 & 15 & -5 \\ 15 & 26 & 3 \\ -5 & 3 & 34 \end{bmatrix}$

(c) $\left(\frac{2x_0 + 3y_0 - z_0}{7}, \ \frac{15x_0 + 26y_0 + 3z_0}{35}, \ \frac{-5x_0 + 3y_0 + 34z_0}{35}\right)$

(d) $\frac{3\sqrt{35}}{7}$


16. Let W be the line with parametric equations
x = 2t, y = -t, z = 4t
(a) Find a basis for W.
(b) Use Formula 10 to find the standard matrix for the orthogonal projection on W.
(c) Use the matrix obtained in part (b) to find the orthogonal projection of a point $P_0(x_0, y_0, z_0)$ on W.
(d) Find the distance between the point $P_0(2, 1, -3)$ and the line W.


17. In $R^3$, consider the line l given by the equations


and the line m given by the equations

x = s, y = 2s - 1, z = 1
Let P be a point on l, and let Q be a point on m. Find the values of t and s that minimize the distance between the lines by
minimizing the squared distance $\|P - Q\|^2$.


Answer:


s = t = 1

18. Prove: If A has linearly independent column vectors, and if Ax = b is consistent, then the least squares solution of Ax = b and
the exact solution of Ax = b are the same.


19. Prove: If A has linearly independent column vectors, and if b is orthogonal to the column space of A, then the least squares
solution of Ax = b is x = 0.

20. Let $P\colon R^n \to W$ be the orthogonal projection of $R^n$ onto a subspace W.

(a) Prove that $[P]^2 = [P]$.

(b) What does the result in part (a) imply about the composition P ∘ P?

(c) Show that [P] is symmetric.


21. Let A be an m × n matrix with linearly independent row vectors. Find a standard matrix for the orthogonal projection of $R^n$
onto the row space of A. [Hint: Start with Formula 10.]


Answer:


$[P] = A^T(AA^T)^{-1}A$


22. Prove the implication (b) ⇒ (a) of Theorem 6.4.3.


True-False Exercises 


In parts (a)-(h) determine whether the statement is true or false, and justify your answer. 


(a) If A is an m × n matrix, then $A^TA$ is a square matrix.


Answer: 


True 


(b) If $A^TA$ is invertible, then A is invertible.


Answer: 


False 


(c) If A is invertible, then $A^TA$ is invertible.


Answer: 


True 


(d) If Ax = b is a consistent linear system, then $A^TA\mathbf{x} = A^T\mathbf{b}$ is also consistent.
Answer: 


True 


(e) If Ax = b is an inconsistent linear system, then $A^TA\mathbf{x} = A^T\mathbf{b}$ is also inconsistent.


Answer: 


False 


(f) Every linear system has a least squares solution. 
Answer: 


True 


(g) Every linear system has a unique least squares solution. 
Answer: 


False 


(h) If A is an m × n matrix with linearly independent columns and b is in $R^m$, then Ax = b has a unique least squares solution.
Answer: 


True 
6.5 Least Squares Fitting to Data


In this section we will use results about orthogonal projections in inner product spaces to obtain a technique 
for fitting a line or other polynomial curve to a set of experimentally determined points in the plane. 


Fitting a Curve to Data


A common problem in experimental work is to obtain a mathematical relationship y = f(x) between two
variables x and y by “fitting” a curve to points in the plane corresponding to various experimentally
determined values of x and y, say


$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$


On the basis of theoretical considerations or simply by observing the pattern of the points, the experimenter
decides on the general form of the curve y = f(x) to be fitted. Some possibilities are (Figure 6.5.1)


(a) A straight line: $y = a + bx$
(b) A quadratic polynomial: $y = a + bx + cx^2$
(c) A cubic polynomial: $y = a + bx + cx^2 + dx^3$


Because the points are obtained experimentally, there is often some measurement "error" in the data, making 
it impossible to find a curve of the desired form that passes through all the points. Thus, the idea is to choose 
the curve (by determining its coefficients) that “best” fits the data. We begin with the simplest and most 
common case: fitting a straight line to data points. 


Figure 6.5.1 (a) $y = a + bx$; (b) $y = a + bx + cx^2$; (c) $y = a + bx + cx^2 + dx^3$


Least Squares Fit of a Straight Line 


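Suppose we want to fit a straight line y = a + bx to the experimentally determined points


$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$

If the data points were collinear, the line would pass through all n points, and the unknown coefficients a and
b would satisfy the equations


$$\begin{aligned} y_1 &= a + bx_1 \\ y_2 &= a + bx_2 \\ &\ \,\vdots \\ y_n &= a + bx_n \end{aligned}$$

We can write this system in matrix form as

$$\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

or more compactly as

$$M\mathbf{v} = \mathbf{y} \qquad (1)$$

where

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad M = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix} \qquad (2)$$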

If the data points are not collinear, then it is impossible to find coefficients a and b that satisfy system 1
exactly; that is, the system is inconsistent. In this case we will look for a least squares solution


$$\mathbf{v} = \mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix}$$


We call a line $y = a^* + b^*x$ whose coefficients come from a least squares solution a regression line or a
least squares straight line fit to the data. To explain this terminology, recall that a least squares solution of 1
minimizes


$$\|\mathbf{y} - M\mathbf{v}\| \qquad (3)$$

If we express the square of 3 in terms of components, we obtain

$$\|\mathbf{y} - M\mathbf{v}\|^2 = (y_1 - a - bx_1)^2 + (y_2 - a - bx_2)^2 + \cdots + (y_n - a - bx_n)^2 \qquad (4)$$


If we now let


$$d_1 = |y_1 - a - bx_1|, \quad d_2 = |y_2 - a - bx_2|, \quad \ldots, \quad d_n = |y_n - a - bx_n|$$


then 4 can be written as

$$\|\mathbf{y} - M\mathbf{v}\|^2 = d_1^2 + d_2^2 + \cdots + d_n^2 \qquad (5)$$


As illustrated in Figure 6.5.2, the number $d_i$ can be interpreted as the vertical distance between the line
y = a + bx and the data point $(x_i, y_i)$. This distance is a measure of the “error” at the point $(x_i, y_i)$
resulting from the inexact fit of y = a + bx to the data points, the assumption being that the $x_i$ are known
exactly and that all the error is in the measurement of the $y_i$. Since 3 and 5 are minimized by the same vector
$\mathbf{v}^*$, the least squares straight line fit minimizes the sum of the squares of the estimated errors $d_i$, hence the
name least squares straight line fit.


Figure 6.5.2 $d_i$ measures the vertical error in the least squares straight line.


Normal Equations 


Recall from Theorem 6.4.2 that the least squares solutions of 1 can be obtained by solving the associated
normal system

$$M^TM\mathbf{v} = M^T\mathbf{y}$$


the equations of which are called the normal equations.


In the exercises it will be shown that the column vectors of M are linearly independent if and only if the n data
points do not lie on a vertical line in the xy-plane. In this case it follows from Theorem 6.4.4 that the least
squares solution is unique and is given by


$$\mathbf{v}^* = (M^TM)^{-1}M^T\mathbf{y}$$


In summary, we have the following theorem. 


THEOREM 6.5.1 Uniqueness of the Least Squares Solution 


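Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be a set of two or more data points, not all lying on a vertical
line, and let


$$M = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \quad\text{and}\quad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

Then there is a unique least squares straight line fit

$$y = a^* + b^*x$$


to the data points. Moreover,

$$\mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix}$$

is given by the formula

$$\mathbf{v}^* = (M^TM)^{-1}M^T\mathbf{y} \qquad (6)$$


which expresses the fact that $\mathbf{v} = \mathbf{v}^*$ is the unique solution of the normal equations


$$M^TM\mathbf{v} = M^T\mathbf{y} \qquad (7)$$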


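EXAMPLE 1 Least Squares Straight Line Fit


Find the least squares straight line fit to the four points (0, 1), (1, 3), (2, 4), and (3, 4). (See
Figure 6.5.3.)


Solution We have


$$M = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad M^TM = \begin{bmatrix} 4 & 6 \\ 6 & 14 \end{bmatrix}, \quad\text{and}\quad (M^TM)^{-1} = \frac{1}{10}\begin{bmatrix} 7 & -3 \\ -3 & 2 \end{bmatrix}$$

$$\mathbf{v}^* = (M^TM)^{-1}M^T\mathbf{y} = \frac{1}{10}\begin{bmatrix} 7 & -3 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 1.5 \\ 1 \end{bmatrix}$$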


so the desired line is y = 1.5 + x. 
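For a straight line the normal equations are a 2 × 2 system whose entries are sums over the data, so Formula 6 can be evaluated directly. The sketch below does this with exact rational arithmetic; the function name `fit_line` is illustrative, and the explicit inverse of the 2 × 2 matrix $M^TM$ is used in place of a general solver.

```python
from fractions import Fraction

def fit_line(points):
    """Return (a, b) for the least squares line y = a + b x.

    M^T M = [[n, sum x], [sum x, sum x^2]] and M^T y = [sum y, sum x y];
    the solution is obtained from the explicit inverse of the 2 x 2 matrix.
    """
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    det = n * sxx - sx * sx          # det(M^T M); zero when all x-values coincide
    if det == 0:
        raise ValueError("data points lie on a vertical line; the fit is not unique")
    a = (sxx * sy - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

# Example 1: the four points (0, 1), (1, 3), (2, 4), (3, 4).
a, b = fit_line([(0, 1), (1, 3), (2, 4), (3, 4)])

# Spring data of the form (length, force); decimal strings keep the input exact.
spring_a, spring_b = fit_line([("6.1", 0), ("7.6", 2), ("8.7", 4), ("10.4", 6)])
```

The check on `det` mirrors the hypothesis of Theorem 6.5.1: the fit is unique exactly when the data points do not all lie on a vertical line.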


EXAMPLE 2 Spring Constant


Hooke's law in physics states that the length x of a uniform spring is a linear function of the
force y applied to it. If we express this relationship as y = a + bx, then the coefficient b is
called the spring constant. Suppose a particular unstretched spring has a measured length of 6.1
inches (i.e., x = 6.1 when y = 0). Forces of 2 pounds, 4 pounds, and 6 pounds are then applied
to the spring, and the corresponding lengths are found to be 7.6 inches, 8.7 inches, and 10.4
inches (see Figure 6.5.4). Find the spring constant.


Figure 6.5.4

Solution We have

$$M = \begin{bmatrix} 1 & 6.1 \\ 1 & 7.6 \\ 1 & 8.7 \\ 1 & 10.4 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} 0 \\ 2 \\ 4 \\ 6 \end{bmatrix}$$


and


$$\mathbf{v}^* = \begin{bmatrix} a^* \\ b^* \end{bmatrix} = (M^TM)^{-1}M^T\mathbf{y} \approx \begin{bmatrix} -8.6 \\ 1.4 \end{bmatrix}$$

where the numerical values have been rounded to one decimal place. Thus, the estimated value
of the spring constant is $b^* \approx 1.4$ pounds/inch.


[Figure: Temperature of Venusian Atmosphere. Magellan orbit 3213; Date: 5 October 1991; Latitude: 67 N; LTST: 22:05. Vertical axis: Temperature T (K); horizontal axis: Altitude h (km). Source: NASA]


Historical Note On October 5, 1991 the Magellan spacecraft entered the atmosphere of Venus and
transmitted the temperature T in kelvins (K) versus the altitude h in kilometers (km) until its signal
was lost at an altitude of about 34 km. Discounting the initial erratic signal, the data strongly
suggested a linear relationship, so a least squares straight line fit was used on the linear part of the
data to obtain the equation


$$T = 737.5 - 8.125h$$

By setting h = 0 in this equation, the surface temperature of Venus was estimated at $T \approx 737.5$ K.


Least Squares Fit of a Polynomial 


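The technique described for fitting a straight line to data points can be generalized to fitting a polynomial of
specified degree to data points. Let us attempt to fit a polynomial of fixed degree m


$$y = a_0 + a_1x + \cdots + a_mx^m \qquad (8)$$


to n points


$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$

Substituting these n values of x and y into 8 yields the n equations


$$\begin{aligned} y_1 &= a_0 + a_1x_1 + \cdots + a_mx_1^m \\ y_2 &= a_0 + a_1x_2 + \cdots + a_mx_2^m \\ &\ \,\vdots \\ y_n &= a_0 + a_1x_n + \cdots + a_mx_n^m \end{aligned}$$

or, in matrix form,

$$\mathbf{y} = M\mathbf{v} \qquad (9)$$


where


$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad M = \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^m \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix} \qquad (10)$$

As before, the solutions of the normal equations

$$M^TM\mathbf{v} = M^T\mathbf{y}$$

determine the coefficients of the polynomial, and the vector v minimizes

$$\|\mathbf{y} - M\mathbf{v}\|$$


Conditions that guarantee the invertibility of $M^TM$ are discussed in the exercises (Exercise 7). If $M^TM$ is
invertible, then the normal equations have a unique solution $\mathbf{v} = \mathbf{v}^*$, which is given by


$$\mathbf{v}^* = (M^TM)^{-1}M^T\mathbf{y} \qquad (11)$$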


EXAMPLE 3 Fitting a Quadratic Curve to Data


According to Newton's second law of motion, a body near the Earth's surface falls vertically
downward according to the equation


$$s = s_0 + v_0t + \tfrac{1}{2}gt^2 \qquad (12)$$

where
s = vertical displacement downward relative to some fixed point
$s_0$ = initial displacement at time t = 0
$v_0$ = initial velocity at time t = 0
g = acceleration of gravity at the Earth's surface


Suppose that a laboratory experiment is performed to evaluate g from Equation 12 by releasing a weight
with unknown initial displacement and velocity and measuring the distance it has fallen at certain times
relative to a fixed reference point. Suppose it is found that at times
t = .1, .2, .3, .4, and .5 seconds the weight has fallen s = -0.18, 0.31, 1.03, 2.48, and 3.73
feet, respectively, from the reference point. Find an approximate value of g using these data.


Solution The mathematical problem is to fit a quadratic curve

$$s = a_0 + a_1t + a_2t^2 \qquad (13)$$


to the five data points:

$$(.1, -0.18), \quad (.2, 0.31), \quad (.3, 1.03), \quad (.4, 2.48), \quad (.5, 3.73)$$


With the appropriate adjustments in notation, the matrices M and y in 10 are


$$M = \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \\ 1 & t_4 & t_4^2 \\ 1 & t_5 & t_5^2 \end{bmatrix} = \begin{bmatrix} 1 & .1 & .01 \\ 1 & .2 & .04 \\ 1 & .3 & .09 \\ 1 & .4 & .16 \\ 1 & .5 & .25 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \\ s_5 \end{bmatrix} = \begin{bmatrix} -0.18 \\ 0.31 \\ 1.03 \\ 2.48 \\ 3.73 \end{bmatrix}$$

Thus, from 11,

$$\mathbf{v}^* = \begin{bmatrix} a_0^* \\ a_1^* \\ a_2^* \end{bmatrix} = (M^TM)^{-1}M^T\mathbf{y} \approx \begin{bmatrix} -0.40 \\ 0.35 \\ 16.1 \end{bmatrix}$$


From 12 and 13, we have $a_2 = \tfrac{1}{2}g$, so the estimated value of g is

$$g = 2a_2^* = 2(16.1) = 32.2 \ \text{feet/second}^2$$

If desired, we can also estimate the initial displacement and initial velocity of the weight:

$$s_0 = a_0^* = -0.40 \ \text{feet}$$
$$v_0 = a_1^* = 0.35 \ \text{feet/second}$$


In Figure 6.5.5 we have plotted the five data points and the approximating polynomial.


Figure 6.5.5 (vertical axis: distance s, in feet; horizontal axis: time t, in seconds)
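The polynomial fit of Example 3 can be reproduced by building the matrix M of Formula 10 and solving the normal equations. This is a minimal sketch with hypothetical helper names (`solve`, `fit_polynomial`); it assumes $M^TM$ is invertible and uses exact rational arithmetic so that rounding enters only when the result is displayed.

```python
from fractions import Fraction

def solve(G, rhs):
    """Solve G x = rhs by Gauss-Jordan elimination (G is assumed invertible)."""
    n = len(G)
    aug = [list(G[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def fit_polynomial(points, degree):
    """Return [a_0, ..., a_m] minimizing ||y - M v|| for a degree-m polynomial."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    M = [[x ** k for k in range(degree + 1)] for x, _ in pts]   # rows 1, x, ..., x^m
    y = [yi for _, yi in pts]
    cols = list(zip(*M))
    MtM = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    Mty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(MtM, Mty)

# (time, distance) data from Example 3, entered as decimal strings to stay exact.
data = [("0.1", "-0.18"), ("0.2", "0.31"), ("0.3", "1.03"),
        ("0.4", "2.48"), ("0.5", "3.73")]
a0, a1, a2 = fit_polynomial(data, 2)
g_estimate = 2 * a2    # from a_2 = g / 2
```

Rounded as in the example, the coefficients are -0.40, 0.35, and 16.1. Doubling the unrounded coefficient gives approximately 32.14 feet/second², slightly below the 32.2 obtained by rounding $a_2^*$ to 16.1 before doubling.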


Concept Review 
* Least squares straight line fit 
* Regression line 


* Least squares polynomial fit 


Skills 


* Find the least squares straight line fit to a set of data points. 
* Find the least squares polynomial fit to a set of data points. 


* Use the techniques of this section to solve applied problems. 


Exercise Set 6.5 


1. Find the least squares straight line fit to the three points (0, 0), (1, 2), and (2, 7).
Answer:

$y = -\frac{1}{2} + \frac{7}{2}x$

2. Find the least squares straight line fit to the four points (0, 1), (2, 0), (3, 1), and (3, 2).


3. Find the quadratic polynomial that best fits the four points (2, 0), (3, -10), (5, -48), and (6, -76).


Answer:


$y = 2 + 5x - 3x^2$


4. Find the cubic polynomial that best fits the five points (-1, -14), (0, -5), (1, -4), (2, 1), and
(3, 22).

5. Show that the matrix M in Equation 2 has linearly independent columns if and only if at least two of the
numbers $x_1, x_2, \ldots, x_n$ are distinct.


6. Show that the columns of the n × (m + 1) matrix M in Equation 10 are linearly independent if n > m and
at least m + 1 of the numbers $x_1, x_2, \ldots, x_n$ are distinct. [Hint: A nonzero polynomial of degree m has at
most m distinct roots.]


7. Let M be the matrix in Equation 10. Using Exercise 6, show that a sufficient condition for the matrix
$M^TM$ to be invertible is that n > m and that at least m + 1 of the numbers $x_1, x_2, \ldots, x_n$ are distinct.


8. The owner of a rapidly expanding business finds that for the first five months of the year the sales (in
thousands) are $4.0, $4.4, $5.2, $6.4, and $8.0. The owner plots these figures on a graph and conjectures
that for the rest of the year, the sales curve can be approximated by a quadratic polynomial. Find the least
squares quadratic polynomial fit to the sales curve, and use it to project the sales for the twelfth month of
the year.

9. A corporation obtains the following data relating the number of sales representatives on its staff to annual
sales:


Number of Sales Representatives | 5 | 10 | 15 | 20 | 25 | 30


esses [3л [i [sre [o [nn 


Explain how you might use least squares methods to estimate the annual sales with 45 representatives, and 
discuss the assumptions that you are making. (You need not perform the actual computations.) 


10. Pathfinder is an experimental, lightweight, remotely piloted, solar-powered aircraft that was used in a series
of experiments by NASA to determine the feasibility of applying solar power for long-duration, high-altitude
flight. In August 1997 Pathfinder recorded the data in the accompanying table relating altitude H
and temperature T. Show that a linear model is reasonable by plotting the data, and then find the least
squares line $H = H_0 + kT$ of best fit.


Table Ex-10
Altitude H (thousands of feet) | 15 | 20 | 25 | 30 | 35 | 40 | 45
Temperature T (°C) | 4.5 | -5.9 | -1… | -27.6 | -39.8 | -50.2 | -62.9


11. Find a curve of the form y = a + (b/x) that best fits the data points (1, 7), (3, 3), (6, 1) by making the
substitution X = 1/x. Draw the curve and plot the data points in the same coordinate system.


Answer:

$y = \frac{5}{21} + \frac{48}{7x}$

True-False Exercises 


In parts (a)-(d) determine whether the statement is true or false, and justify your answer.


(a) Every set of data points has a unique least squares straight line fit.


Answer:


False


(b) If the data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ are not collinear, then 1 is an inconsistent system.


Answer:


True


(c) If y = a + bx is the least squares line fit to the data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, then
$d_i = |y_i - (a + bx_i)|$ is minimal for every $1 \le i \le n$.


Answer:


False


(d) If y = a + bx is the least squares line fit to the data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, then
$\sum_{i=1}^{n} [y_i - (a + bx_i)]^2$ is minimal.


Answer: 


True 
6.6 Function Approximation; Fourier Series


In this section we will show how orthogonal projections can be used to approximate certain types of functions by
simpler functions that are easier to work with. The ideas explained here have important applications in
engineering and science. Calculus is required.


Best Approximations


All of the problems that we will study in this section will be special cases of the following general problem.


APPROXIMATION PROBLEM


Given a function f that is continuous on an interval [a, b], find the “best possible approximation” to f
using only functions from a specified subspace W of C[a, b].


Here are some examples of such problems:
(a) Find the best possible approximation to $e^x$ over [0, 1] by a polynomial of the form $a_0 + a_1x + a_2x^2$.


(b) Find the best possible approximation to $\sin \pi x$ over [-1, 1] by a function of the form
$a_0 + a_1e^x + a_2e^{2x} + a_3e^{3x}$.
(c) Find the best possible approximation to x over [0, 2π] by a function of the form
$a_0 + a_1\sin x + a_2\sin 2x + b_1\cos x + b_2\cos 2x$.
In the first example W is the subspace of C[0, 1] spanned by 1, x, and $x^2$; in the second example W is the
subspace of C[-1, 1] spanned by 1, $e^x$, $e^{2x}$, and $e^{3x}$; and in the third example W is the subspace of
C[0, 2π] spanned by 1, sin x, sin 2x, cos x, and cos 2x.


Measurements of Error 


To solve approximation problems of the preceding types, we first need to make the phrase “best
approximation over [a, b]” mathematically precise. To do this we will need some way of quantifying the
error that results when one continuous function is approximated by another over an interval [a, b]. If we
were to approximate f(x) by g(x), and if we were concerned only with the error in that approximation at a
single point $x_0$, then it would be natural to define the error to be

$$\text{error} = |f(x_0) - g(x_0)|$$

sometimes called the deviation between f and g at $x_0$ (Figure 6.6.1). However, we are not concerned simply
with measuring the error at a single point but rather with measuring it over the entire interval [a, b]. The
problem is that an approximation may have small deviations in one part of the interval and large deviations in
another. One possible way of accounting for this is to integrate the deviation |f(x) - g(x)| over the interval
[a, b] and define the error over the interval to be


$$\text{error} = \int_a^b |f(x) - g(x)| \, dx \qquad (1)$$


Geometrically, 1 is the area between the graphs of f(x) and g(x) over the interval [a, b] (Figure 6.6.2); the
greater the area, the greater the overall error.


Figure 6.6.1 The deviation between f and g at $x_0$


Figure 6.6.2 The area between the graphs of f and g over [a, b] measures the error in approximating f
by g over [a, b]


Although 1 is natural and appealing geometrically, most mathematicians and scientists generally favor the
following alternative measure of error, called the mean square error:


$$\text{mean square error} = \int_a^b [f(x) - g(x)]^2 \, dx$$


Mean square error emphasizes the effect of larger errors because of the squaring and has the added advantage
that it allows us to bring to bear the theory of inner product spaces. To see how, suppose that f is a continuous
function on [a, b] that we want to approximate by a function g from a subspace W of C[a, b], and suppose
that C[a, b] is given the inner product


$$\langle \mathbf{f}, \mathbf{g} \rangle = \int_a^b f(x)g(x) \, dx$$


It follows that


$$\|\mathbf{f} - \mathbf{g}\|^2 = \langle \mathbf{f} - \mathbf{g}, \mathbf{f} - \mathbf{g} \rangle = \int_a^b [f(x) - g(x)]^2 \, dx = \text{mean square error}$$


so minimizing the mean square error is the same as minimizing $\|\mathbf{f} - \mathbf{g}\|^2$. Thus the approximation problem
posed informally at the beginning of this section can be restated more precisely as follows.


Least Squares Approximation 


LEAST SQUARES APPROXIMATION PROBLEM


Let f be a function that is continuous on an interval [a, b], let C[a, b] have the inner product


$$\langle \mathbf{f}, \mathbf{g} \rangle = \int_a^b f(x)g(x) \, dx$$


and let W be a finite-dimensional subspace of C[a, b]. Find a function g in W that minimizes


$$\|\mathbf{f} - \mathbf{g}\|^2 = \int_a^b [f(x) - g(x)]^2 \, dx$$


Since $\|\mathbf{f} - \mathbf{g}\|^2$ and $\|\mathbf{f} - \mathbf{g}\|$ are minimized by the same function g, this problem is equivalent to looking for a
function g in W that is closest to f. But we know from Theorem 6.4.1 that $\mathbf{g} = \operatorname{proj}_W \mathbf{f}$ is such a function
(Figure 6.6.3).


Figure 6.6.3 (f = function in C[a, b] to be approximated; W = subspace of approximating functions; $\mathbf{g} = \operatorname{proj}_W \mathbf{f}$ = least squares approximation to f from W)


Thus, we have the following result. 


THEOREM 6.6.1 


If $f$ is a continuous function on $[a, b]$, and $W$ is a finite-dimensional subspace of $C[a, b]$, then the 
function $g$ in $W$ that minimizes the mean square error 

$$\int_a^b [f(x) - g(x)]^2\,dx$$

is $g = \operatorname{proj}_W f$, where the orthogonal projection is relative to the inner product 

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$$

The function $g = \operatorname{proj}_W f$ is called the least squares approximation to $f$ from $W$. 
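
As a rough numerical illustration of Theorem 6.6.1, the following Python sketch computes $\operatorname{proj}_W f$ by applying the Gram-Schmidt process to a basis of $W$ and approximating each inner product with Simpson's rule. The names `inner` and `project`, the step count, and the test case ($f(x) = x^2$ on $[0, 1]$ with $W$ spanned by $1$ and $x$) are assumptions chosen for illustration and are not taken from the text.

```python
def inner(f, g, a, b, steps=2000):
    """Approximate <f, g> = integral of f(x) g(x) over [a, b] by Simpson's rule.

    steps must be even.
    """
    h = (b - a) / steps
    total = f(a) * g(a) + f(b) * g(b)
    for i in range(1, steps):
        x = a + i * h
        total += (4 if i % 2 else 2) * f(x) * g(x)
    return total * h / 3


def project(f, basis, a, b):
    """Return proj_W f as a callable, where W is spanned by the functions in basis."""
    ortho = []  # orthogonal (not normalized) basis built by Gram-Schmidt
    for v in basis:
        coeffs = [inner(v, q, a, b) / inner(q, q, a, b) for q in ortho]
        prev = list(ortho)
        # v minus its components along the previously orthogonalized functions
        ortho.append(lambda x, v=v, coeffs=coeffs, prev=prev:
                     v(x) - sum(c * q(x) for c, q in zip(coeffs, prev)))
    weights = [inner(f, q, a, b) / inner(q, q, a, b) for q in ortho]
    return lambda x: sum(w * q(x) for w, q in zip(weights, ortho))


# Hypothetical test case: approximate f(x) = x^2 on [0, 1] from W = span{1, x}.
g = project(lambda x: x * x, [lambda x: 1.0, lambda x: x], 0.0, 1.0)
print(g(0.0), g(1.0))  # close to -1/6 and 5/6, i.e. g(x) = x - 1/6
```

Solving the two normal equations by hand for this test case gives $g(x) = x - \frac{1}{6}$, which is what the sketch reproduces up to rounding.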


Fourier Series 


A function of the form 

$$T(x) = c_0 + c_1\cos x + c_2\cos 2x + \cdots + c_n\cos nx + d_1\sin x + d_2\sin 2x + \cdots + d_n\sin nx \qquad (2)$$

is called a trigonometric polynomial; if $c_n$ and $d_n$ are not both zero, then $T(x)$ is said to have order $n$. For 
example, 

$$T(x) = 2 + \cos x - 3\cos 2x + 7\sin 4x$$

is a trigonometric polynomial of order 4 with 

$$c_0 = 2,\ c_1 = 1,\ c_2 = -3,\ c_3 = 0,\ c_4 = 0,\ d_1 = 0,\ d_2 = 0,\ d_3 = 0,\ d_4 = 7$$

It is evident from (2) that the trigonometric polynomials of order $n$ or less are the various possible linear 
combinations of 

$$1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos nx,\ \sin x,\ \sin 2x,\ \ldots,\ \sin nx \qquad (3)$$

It can be shown that these $2n + 1$ functions are linearly independent and thus form a basis for a 
$(2n + 1)$-dimensional subspace of $C[a, b]$. 


Let us now consider the problem of finding the least squares approximation of a continuous function $f(x)$ 
over the interval $[0, 2\pi]$ by a trigonometric polynomial of order $n$ or less. As noted above, the least squares 
approximation to $f$ from $W$ is the orthogonal projection of $f$ on $W$. To find this orthogonal projection, we must 
find an orthonormal basis $g_0, g_1, \ldots, g_{2n}$ for $W$, after which we can compute the orthogonal projection on $W$ 
from the formula 

$$\operatorname{proj}_W f = \langle f, g_0 \rangle g_0 + \langle f, g_1 \rangle g_1 + \cdots + \langle f, g_{2n} \rangle g_{2n} \qquad (4)$$

(see Theorem 6.3.4b). An orthonormal basis for $W$ can be obtained by applying the Gram-Schmidt process to 
the basis vectors in (3) using the inner product 

$$\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx$$


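This yields the orthonormal basis 

$$g_0 = \frac{1}{\sqrt{2\pi}},\quad g_1 = \frac{1}{\sqrt{\pi}}\cos x,\ \ldots,\ g_n = \frac{1}{\sqrt{\pi}}\cos nx,\quad g_{n+1} = \frac{1}{\sqrt{\pi}}\sin x,\ \ldots,\ g_{2n} = \frac{1}{\sqrt{\pi}}\sin nx \qquad (5)$$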


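(see Exercise 6). If we introduce the notation 

$$a_0 = \frac{2}{\sqrt{2\pi}}\langle f, g_0 \rangle,\quad a_1 = \frac{1}{\sqrt{\pi}}\langle f, g_1 \rangle,\ \ldots,\ a_n = \frac{1}{\sqrt{\pi}}\langle f, g_n \rangle$$
$$b_1 = \frac{1}{\sqrt{\pi}}\langle f, g_{n+1} \rangle,\ \ldots,\ b_n = \frac{1}{\sqrt{\pi}}\langle f, g_{2n} \rangle \qquad (6)$$

then on substituting (5) in (4), we obtain 

$$\operatorname{proj}_W f = \frac{a_0}{2} + [a_1\cos x + \cdots + a_n\cos nx] + [b_1\sin x + \cdots + b_n\sin nx] \qquad (7)$$

where 

$$a_0 = \frac{2}{\sqrt{2\pi}}\langle f, g_0 \rangle = \frac{2}{\sqrt{2\pi}}\int_0^{2\pi} f(x)\frac{1}{\sqrt{2\pi}}\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\,dx$$
$$a_1 = \frac{1}{\sqrt{\pi}}\langle f, g_1 \rangle = \frac{1}{\sqrt{\pi}}\int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\cos x\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos x\,dx$$
$$\vdots$$
$$a_n = \frac{1}{\sqrt{\pi}}\langle f, g_n \rangle = \frac{1}{\sqrt{\pi}}\int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\cos nx\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos nx\,dx$$
$$b_1 = \frac{1}{\sqrt{\pi}}\langle f, g_{n+1} \rangle = \frac{1}{\sqrt{\pi}}\int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\sin x\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin x\,dx$$
$$\vdots$$
$$b_n = \frac{1}{\sqrt{\pi}}\langle f, g_{2n} \rangle = \frac{1}{\sqrt{\pi}}\int_0^{2\pi} f(x)\frac{1}{\sqrt{\pi}}\sin nx\,dx = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin nx\,dx$$

In short, 

$$a_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx,\quad b_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx \qquad (8)$$


The numbers $a_0, a_1, \ldots, a_n, b_1, \ldots, b_n$ are called the Fourier coefficients of $f$. 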
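
The coefficient formulas can also be evaluated numerically. The sketch below is only an illustration: it approximates the integrals $\frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx$ and $\frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx$ with Simpson's rule, and the helper names `integrate` and `fourier_coefficients`, the step count, and the sample function $f(x) = x$ are assumptions made for this example.

```python
import math


def integrate(func, a, b, steps=2000):
    """Approximate the integral of func over [a, b] with the composite Simpson rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def fourier_coefficients(f, n):
    """Return ([a_0, ..., a_n], [b_1, ..., b_n]) for f on [0, 2*pi]."""
    two_pi = 2 * math.pi
    a = [integrate(lambda x, k=k: f(x) * math.cos(k * x), 0.0, two_pi) / math.pi
         for k in range(n + 1)]
    b = [integrate(lambda x, k=k: f(x) * math.sin(k * x), 0.0, two_pi) / math.pi
         for k in range(1, n + 1)]
    return a, b


# Sample run for f(x) = x with n = 3 (an assumed test case).
a, b = fourier_coefficients(lambda x: x, 3)
print(a)  # a_0 is close to 2*pi; a_1, a_2, a_3 are close to 0
print(b)  # close to -2, -1, -2/3
```

For $f(x) = x$ the exact values are $a_0 = 2\pi$, $a_k = 0$, and $b_k = -2/k$ for $k \ge 1$ (each follows from one integration by parts), so the printed numbers give a quick check on the quadrature.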
EXAMPLE 1 Least Squares Approximations 


Find the least squares approximation of $f(x) = x$ on $[0, 2\pi]$ by 
(a) a trigonometric polynomial of order 2 or less; 


(b) a trigonometric polynomial of order $n$ or less. 


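Solution 

(a) 

$$a_0 = \frac{1}{\pi}\int_0^{2\pi} f(x)\,dx = \frac{1}{\pi}\int_0^{2\pi} x\,dx = 2\pi \qquad (9a)$$

For $k = 1, 2, \ldots$, integration by parts yields (verify) 

$$a_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos kx\,dx = \frac{1}{\pi}\int_0^{2\pi} x\cos kx\,dx = 0 \qquad (9b)$$

$$b_k = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin kx\,dx = \frac{1}{\pi}\int_0^{2\pi} x\sin kx\,dx = -\frac{2}{k} \qquad (9c)$$

Thus, the least squares approximation to $x$ on $[0, 2\pi]$ by a trigonometric polynomial of 
order 2 or less is 

$$x \approx \frac{a_0}{2} + a_1\cos x + a_2\cos 2x + b_1\sin x + b_2\sin 2x$$

or, from (9a), (9b), and (9c), 

$$x \approx \pi - 2\sin x - \sin 2x$$

(b) The least squares approximation to $x$ on $[0, 2\pi]$ by a trigonometric polynomial of order $n$ 
or less is 

$$x \approx \frac{a_0}{2} + [a_1\cos x + \cdots + a_n\cos nx] + [b_1\sin x + \cdots + b_n\sin nx]$$

or, from (9a), (9b), and (9c), 

$$x \approx \pi - 2\left(\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots + \frac{\sin nx}{n}\right)$$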


The graphs of $y = x$ and some of these approximations are shown in Figure 6.6.4. 


Figure 6.6.4 Graphs of $y = x$ and of several of its least squares approximations, including $y = \pi - 2\left(\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \frac{\sin 4x}{4}\right)$


It is natural to expect that the mean square error will diminish as the number of terms in the 
least squares approximation 

$$f(x) \approx \frac{a_0}{2} + \sum_{k=1}^{n} (a_k\cos kx + b_k\sin kx)$$

increases. It can be proved that for functions $f$ in $C[0, 2\pi]$, the mean square error 
approaches zero as $n \to +\infty$; this is denoted by writing 

$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k\cos kx + b_k\sin kx)$$

The right side of this equation is called the Fourier series for $f$ over the interval $[0, 2\pi]$. 
Such series are of major importance in engineering, science, and mathematics. 
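
To see the mean square error shrink in a concrete case, the following Python sketch evaluates $\int_0^{2\pi} [x - g_n(x)]^2\,dx$ for the approximations $g_n(x) = \pi - 2\left(\sin x + \frac{\sin 2x}{2} + \cdots + \frac{\sin nx}{n}\right)$ of Example 1. The midpoint rule, the step count, and the names `partial_sum`, `mean_square_error`, and `orders` are assumptions made for this illustration.

```python
import math


def partial_sum(x, n):
    """Order-n least squares approximation of f(x) = x on [0, 2*pi] from Example 1."""
    return math.pi - 2 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def mean_square_error(n, steps=4000):
    """Midpoint-rule estimate of the integral of [x - partial_sum(x, n)]^2 over [0, 2*pi]."""
    h = 2 * math.pi / steps
    midpoints = (h * (i + 0.5) for i in range(steps))
    return h * sum((x - partial_sum(x, n)) ** 2 for x in midpoints)


orders = (1, 2, 4, 8, 16)
errors = [mean_square_error(n) for n in orders]
for n, err in zip(orders, errors):
    print(n, err)  # the error decreases as n grows
```

For this particular $f$, the error can also be written in closed form as $\frac{2\pi^3}{3} - 4\pi\sum_{k=1}^{n}\frac{1}{k^2}$, obtained by subtracting the squared norm of the projection from $\|f\|^2$. Since $\sum_{k \ge 1} 1/k^2 = \pi^2/6$, this expression tends to zero, which is consistent with the printed values.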


Jean Baptiste Fourier (1768—1830) 


Historical Note Fourier was a French mathematician and physicist who discovered 
the Fourier series and related ideas while working on problems of heat diffusion. This 
discovery was one of the most influential in the history of mathematics; it is the 
cornerstone of many fields of mathematical research and a basic tool in many branches 
of engineering. Fourier, a political activist during the French revolution, spent time in 
jail for his defense of many victims during the Terror. He later became a favorite of 
Napoleon and was named a baron. 

[Image: The Granger Collection, New York] 


Concept Review 

* Approximation of functions 
* Mean square error 

* Least squares approximation 
* Trigonometric polynomial 

* Fourier coefficients 


* Fourier series 


Skills 
* Find the least squares approximation of a function. 
* Find the mean square error of the least squares approximation of a function. 


* Compute the Fourier series of a function. 


Exercise Set 6.6 


1. Find the least squares approximation of $f(x) = 1 + x$ over the interval $[0, 2\pi]$ by 
(a) a trigonometric polynomial of order 2 or less. 


(b) a trigonometric polynomial of order $n$ or less. 


Answer: 


(a) $(1 + \pi) - 2\sin x - \sin 2x$ 


(b) $(1 + \pi) - 2\left[\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots + \frac{\sin nx}{n}\right]$ 


2. Find the least squares approximation of $f(x) = x^2$ over the interval $[0, 2\pi]$ by 


(a) a trigonometric polynomial of order 3 or less. 


(b) a trigonometric polynomial of order $n$ or less. 


3. (a) Find the least squares approximation of $x$ over the interval $[0, 1]$ by a function of the form $a + be^x$. 


(b) Find the mean square error of the approximation. 


Answer: 


(a) $-\frac{1}{2} + \frac{1}{e - 1}e^x$ 

(b) $\frac{1}{12} + \frac{3 - e}{2(1 - e)}$ 

4. (a) Find the least squares approximation of $e^x$ over the interval $[0, 1]$ by a polynomial of the form 
$a_0 + a_1x$. 


(b) Find the mean square error of the approximation. 


5. (a) Find the least squares approximation of $\sin \pi x$ over the interval $[-1, 1]$ by a polynomial of the form 
$a_0 + a_1x + a_2x^2$. 


(b) Find the mean square error of the approximation. 
Answer: 


(a) $\frac{3x}{\pi}$ 

(b) $1 - \frac{6}{\pi^2}$ 


6. Use the Gram-Schmidt process to obtain the orthonormal basis (5) from the basis (3). 
7. Carry out the integrations indicated in Formulas (9a), (9b), and (9c). 


8. Find the Fourier series of $f(x) = \pi - x$ over the interval $[0, 2\pi]$. 


9. Find the Fourier series of $f(x) = 1$, $0 < x < \pi$ and $f(x) = 0$, $\pi \le x \le 2\pi$ over the interval $[0, 2\pi]$. 
Answer: 
$\frac{1}{2} + \sum_{k=1}^{\infty} \frac{1}{\pi k}\left(1 - (-1)^k\right)\sin kx$ 

10. What is the Fourier series of sin(3x)? 

True-False Exercises 

In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 


(a) If a function $f$ in $C[a, b]$ is approximated by the function $g$, then the mean square error is the same as the 
area between the graphs of $f(x)$ and $g(x)$ over the interval $[a, b]$. 


Answer: 


False 
(b) Given a finite-dimensional subspace $W$ of $C[a, b]$, the function $g = \operatorname{proj}_W f$ minimizes the mean square 
error. 
Answer: 


True 


(c) $\{1, \cos x, \sin x, \cos 2x, \sin 2x\}$ is an orthogonal subset of the vector space $C[0, 2\pi]$ with respect to the 
inner product $\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx$. 


Answer: 


True 


(d) $\{1, \cos x, \sin x, \cos 2x, \sin 2x\}$ is an orthonormal subset of the vector space $C[0, 2\pi]$ with respect to the 
inner product $\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx$. 


Answer: 


False 


(e) $\{1, \cos x, \sin x, \cos 2x, \sin 2x\}$ is a linearly independent subset of $C[0, 2\pi]$. 
Answer: 


True 
Chapter 6 Supplementary Exercises 


1. Let $R^4$ have the Euclidean inner product. 


(a) Find a vector in $R^4$ that is orthogonal to $u_1 = (1, 0, 0, 0)$ and $u_4 = (0, 0, 0, 1)$ and makes equal 
angles with $u_2 = (0, 1, 0, 0)$ and $u_3 = (0, 0, 1, 0)$. 


(b) Find a vector $x = (x_1, x_2, x_3, x_4)$ of length 1 that is orthogonal to $u_1$ and $u_4$ above and such that the 
cosine of the angle between $x$ and $u_2$ is twice the cosine of the angle between $x$ and $u_3$. 


Answer: 
(a) $(0, a, a, 0)$ with $a \ne 0$ 
(b) $\pm\left(0, \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}, 0\right)$ 
2. Prove: If $\langle u, v \rangle$ is the Euclidean inner product on $R^n$, and if $A$ is an $n \times n$ matrix, then 

$$\langle u, Av \rangle = \langle A^Tu, v \rangle$$

[Hint: Use the fact that $\langle u, v \rangle = u \cdot v = v^Tu$.] 


3. Let $M_{22}$ have the inner product $\langle U, V \rangle = \operatorname{tr}(U^TV) = \operatorname{tr}(V^TU)$ that was defined in Example 6 of 
Section 6.1. Describe the orthogonal complement of 
(a) the subspace of all diagonal matrices. 


(b) the subspace of symmetric matrices. 
Answer: 
(a) The subspace of all matrices in $M_{22}$ with only zeros on the diagonal. 


(b) The subspace of all skew-symmetric matrices in $M_{22}$. 


4. Let $Ax = 0$ be a system of $m$ equations in $n$ unknowns. Show that 

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

is a solution of this system if and only if the vector $x = (x_1, x_2, \ldots, x_n)$ is orthogonal to every row vector 
of $A$ with respect to the Euclidean inner product on $R^n$. 


5. Use the Cauchy-Schwarz inequality to show that if $a_1, a_2, \ldots, a_n$ are positive real numbers, then 

$$(a_1 + a_2 + \cdots + a_n)\left(\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}\right) \ge n^2$$


6. Show that if $x$ and $y$ are vectors in an inner product space and $c$ is any scalar, then 

$$\|cx + y\|^2 = c^2\|x\|^2 + 2c\langle x, y \rangle + \|y\|^2$$


7. Let $R^3$ have the Euclidean inner product. Find two vectors of length 1 that are orthogonal to all three of 
the vectors $u_1 = (1, 1, -1)$, $u_2 = (-2, -1, 2)$, and $u_3 = (-1, 0, 1)$. 


Answer: 
$\pm\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ 
8. Find a weighted Euclidean inner product on $R^n$ such that the vectors 

$$v_1 = (1, 0, 0, \ldots, 0)$$
$$v_2 = (0, \sqrt{2}, 0, \ldots, 0)$$
$$v_3 = (0, 0, \sqrt{3}, \ldots, 0)$$
$$\vdots$$
$$v_n = (0, 0, 0, \ldots, \sqrt{n})$$

form an orthonormal set. 
9. Is there a weighted Euclidean inner product on $R^2$ for which the vectors $(1, 2)$ and $(3, -1)$ form an 
orthonormal set? Justify your answer. 


Answer: 


No 


10. If $u$ and $v$ are vectors in an inner product space $V$, then $u$, $v$, and $u - v$ can be regarded as sides of a 
"triangle" in $V$ (see the accompanying figure). Prove that the law of cosines holds for any such triangle; 
that is, 

$$\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\|\cos\theta$$

where $\theta$ is the angle between $u$ and $v$. 


Figure Ex-10 


11. (a) As shown in Figure 3.2.6, the vectors $(k, 0, 0)$, $(0, k, 0)$, and $(0, 0, k)$ form the edges of a cube in $R^3$ 
with diagonal $(k, k, k)$. Similarly, the vectors 

$$(k, 0, 0, \ldots, 0),\ (0, k, 0, \ldots, 0),\ \ldots,\ (0, 0, 0, \ldots, k)$$

can be regarded as edges of a “cube” in $R^n$ with diagonal $(k, k, k, \ldots, k)$. Show that each of the above 
edges makes an angle of $\theta$ with the diagonal, where $\cos\theta = 1/\sqrt{n}$. 


(b) Calculus required What happens to the angle $\theta$ in part (a) as the dimension of $R^n$ approaches $+\infty$? 


Answer: 


(b) $\theta$ approaches $\frac{\pi}{2}$ 


12. Let $u$ and $v$ be vectors in an inner product space. 
(a) Prove that $\|u\| = \|v\|$ if and only if $u + v$ and $u - v$ are orthogonal. 
(b) Give a geometric interpretation of this result in $R^2$ with the Euclidean inner product. 


13. Let $u$ be a vector in an inner product space $V$, and let $\{v_1, v_2, \ldots, v_n\}$ be an orthonormal basis for $V$. 
Show that if $\alpha_i$ is the angle between $u$ and $v_i$, then 

$$\cos^2\alpha_1 + \cos^2\alpha_2 + \cdots + \cos^2\alpha_n = 1$$


14. Prove: If $\langle u, v \rangle_1$ and $\langle u, v \rangle_2$ are two inner products on a vector space $V$, then the quantity 
$\langle u, v \rangle = \langle u, v \rangle_1 + \langle u, v \rangle_2$ is also an inner product. 


15. Prove Theorem 6.2.5. 


16. Prove: If $A$ has linearly independent column vectors, and if $b$ is orthogonal to the column space of $A$, then 
the least squares solution of $Ax = b$ is $x = 0$. 


17. Is there any value of $s$ for which $x_1 = 1$ and $x_2 = 2$ is the least squares solution of the following linear 
system? 

$$x_1 - x_2 = 1$$
$$2x_1 + 3x_2 = 1$$
$$4x_1 + 5x_2 = s$$

Explain your reasoning. 
Answer: 


No 


18. Show that if $p$ and $q$ are distinct positive integers, then the functions $f(x) = \sin px$ and $g(x) = \sin qx$ are 
orthogonal with respect to the inner product 

$$\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx$$


19. Show that if $p$ and $q$ are positive integers, then the functions $f(x) = \cos px$ and $g(x) = \sin qx$ are 
orthogonal with respect to the inner product 

$$\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx$$
Diagonalization and 
Quadratic Forms 


CHAPTER CONTENTS 


7.1. Orthogonal Matrices 

7.2. Orthogonal Diagonalization 

7.3. Quadratic Forms 

7.4. Optimization Using Quadratic Forms 


7.5. Hermitian, Unitary, and Normal Matrices 


INTRODUCTION 


In Section 5.2 we found conditions that guaranteed the diagonalizability of an $n \times n$ 
matrix, but we did not consider what class or classes of matrices might actually satisfy 
those conditions. In this chapter we will show that every symmetric matrix is 
diagonalizable. This is an extremely important result because many applications utilize it 
in some essential way. 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


7.1 Orthogonal Matrices 


In this section we will discuss the class of matrices whose inverses can be obtained by transposition. Such matrices occur in a variety of 
applications and arise as well as transition matrices when one orthonormal basis is changed to another. 


Orthogonal Matrices 


We begin with the following definition. 


DEFINITION 1 
A square matrix $A$ is said to be orthogonal if its transpose is the same as its inverse, that is, if 

$$A^{-1} = A^T$$

or, equivalently, if 

$$AA^T = A^TA = I \qquad (1)$$


Recall from Theorem 1.6.3 that if either product in (1) holds, then 
so does the other. Thus, $A$ is orthogonal if either $AA^T = I$ or 
$A^TA = I$. 


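EXAMPLE 1 A 3 × 3 Orthogonal Matrix 


The matrix 

$$A = \begin{bmatrix} \frac{3}{7} & \frac{2}{7} & \frac{6}{7} \\ -\frac{6}{7} & \frac{3}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{6}{7} & -\frac{3}{7} \end{bmatrix}$$

is orthogonal since 

$$A^TA = \begin{bmatrix} \frac{3}{7} & -\frac{6}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{3}{7} & \frac{6}{7} \\ \frac{6}{7} & \frac{2}{7} & -\frac{3}{7} \end{bmatrix}\begin{bmatrix} \frac{3}{7} & \frac{2}{7} & \frac{6}{7} \\ -\frac{6}{7} & \frac{3}{7} & \frac{2}{7} \\ \frac{2}{7} & \frac{6}{7} & -\frac{3}{7} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$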


EXAMPLE 2 Rotation and Reflection Matrices are Orthogonal 


Recall from Table 5 of Section 4.9 that the standard matrix for the counterclockwise rotation of $R^2$ through an angle $\theta$ is 

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

This matrix is orthogonal for all choices of $\theta$ since 

$$A^TA = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

We leave it for you to verify that the reflection matrices in Tables 1 and 2 and the rotation matrices in Table 6 of 
Section 4.9 are all orthogonal. 
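
A quick numerical check of the definition can be sketched in Python. The helper names `transpose`, `matmul`, `is_orthogonal`, and `rotation`, together with the tolerance, are assumptions for this illustration; the test simply forms $A^TA$ and compares it with the identity.

```python
import math


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Return the matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_orthogonal(m, tol=1e-12):
    """True when m is square and m^T m equals the identity up to tol."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    p = matmul(transpose(m), m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


def rotation(theta):
    """Standard matrix for the counterclockwise rotation of R^2 through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


print(all(is_orthogonal(rotation(t)) for t in (0.0, 0.3, 1.0, 2.5)))
print(is_orthogonal([[2.0, 0.0], [0.0, 2.0]]))  # orthogonal columns, but not unit length
```

The second matrix has mutually orthogonal columns that are not unit vectors, so it fails the test; orthonormality, not just orthogonality, is what the definition demands.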


Observe that for the orthogonal matrices in Example 1 and Example 2, both the row vectors and the column vectors form orthonormal sets with 
respect to the Euclidean inner product. This is a consequence of the following theorem. 


THEOREM 7.1.1 


The following are equivalent for an $n \times n$ matrix $A$. 
(a) $A$ is orthogonal. 
(b) The row vectors of $A$ form an orthonormal set in $R^n$ with the Euclidean inner product. 


(c) The column vectors of $A$ form an orthonormal set in $R^n$ with the Euclidean inner product. 


Proof We will prove the equivalence of (a) and (b) and leave the equivalence of (a) and (c) as an exercise. 


(a) $\Leftrightarrow$ (b) The entry in the $i$th row and $j$th column of the matrix product $AA^T$ is the dot product of the $i$th row vector of $A$ and the $j$th column 
vector of $A^T$ (see Formula 5 of Section 1.3). But except for a difference in form, the $j$th column vector of $A^T$ is the $j$th row vector of $A$. Thus, if the 
row vectors of $A$ are $r_1, r_2, \ldots, r_n$, then the matrix product $AA^T$ can be expressed as 

$$AA^T = \begin{bmatrix} r_1 \cdot r_1 & r_1 \cdot r_2 & \cdots & r_1 \cdot r_n \\ r_2 \cdot r_1 & r_2 \cdot r_2 & \cdots & r_2 \cdot r_n \\ \vdots & \vdots & & \vdots \\ r_n \cdot r_1 & r_n \cdot r_2 & \cdots & r_n \cdot r_n \end{bmatrix}$$

[see Formula 28 of Section 3.2]. Thus, it follows that $AA^T = I$ if and only if 

$$r_1 \cdot r_1 = r_2 \cdot r_2 = \cdots = r_n \cdot r_n = 1$$

and 

$$r_i \cdot r_j = 0 \quad \text{when } i \ne j$$

which are true if and only if $\{r_1, r_2, \ldots, r_n\}$ is an orthonormal set in $R^n$. 


WARNING 


Note that an orthogonal matrix is one with orthonormal rows and columns—not simply orthogonal rows and columns. 


The following theorem lists three more fundamental properties of orthogonal matrices. The proofs are all straightforward and are left as exercises. 


THEOREM 7.1.2 


(a) The inverse of an orthogonal matrix is orthogonal. 
(b) A product of orthogonal matrices is orthogonal. 
(c) If $A$ is orthogonal, then $\det(A) = 1$ or $\det(A) = -1$. 


EXAMPLE 3 $\det(A) = \pm 1$ for an Orthogonal Matrix $A$ 


The matrix 


is orthogonal since its row (and column) vectors form orthonormal sets in $R^2$ with the Euclidean inner product. We leave it for you 
to verify that $\det(A) = 1$ and that interchanging the rows produces an orthogonal matrix whose determinant is $-1$. 


Orthogonal Matrices as Linear Operators 


We observed in Example 2 that the standard matrices for the basic reflection and rotation operators on $R^2$ and $R^3$ are orthogonal. The next theorem 
will explain why this is so. 


THEOREM 7.1.3 


If $A$ is an $n \times n$ matrix, then the following are equivalent. 
(a) $A$ is orthogonal. 

(b) $\|Ax\| = \|x\|$ for all $x$ in $R^n$. 

(c) $Ax \cdot Ay = x \cdot y$ for all $x$ and $y$ in $R^n$. 


Proof We will prove the sequence of implications (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (a). 


(a) $\Rightarrow$ (b) Assume that $A$ is orthogonal, so that $A^TA = I$. It follows from Formula 26 of Section 3.2 that 

$$\|Ax\| = (Ax \cdot Ax)^{1/2} = (x \cdot A^TAx)^{1/2} = (x \cdot x)^{1/2} = \|x\|$$

(b) $\Rightarrow$ (c) Assume that $\|Ax\| = \|x\|$ for all $x$ in $R^n$. From Theorem 3.2.7 we have 

$$Ax \cdot Ay = \tfrac{1}{4}\|Ax + Ay\|^2 - \tfrac{1}{4}\|Ax - Ay\|^2 = \tfrac{1}{4}\|A(x + y)\|^2 - \tfrac{1}{4}\|A(x - y)\|^2$$
$$= \tfrac{1}{4}\|x + y\|^2 - \tfrac{1}{4}\|x - y\|^2 = x \cdot y$$

(c) $\Rightarrow$ (a) Assume that $Ax \cdot Ay = x \cdot y$ for all $x$ and $y$ in $R^n$. It follows from Formula 26 of Section 3.2 that 

$$x \cdot y = x \cdot A^TAy$$

which can be rewritten as $x \cdot (A^TAy - y) = 0$ or as 

$$x \cdot (A^TA - I)y = 0$$

Since this equation holds for all $x$ in $R^n$, it holds in particular if $x = (A^TA - I)y$, so 

$$(A^TA - I)y \cdot (A^TA - I)y = 0$$

Thus, it follows from the positivity axiom for inner products that 

$$(A^TA - I)y = 0$$


Since this equation is satisfied by every vector $y$ in $R^n$, it must be that $A^TA - I$ is the zero matrix (why?) and hence that $A^TA = I$. Thus, $A$ is 
orthogonal. 


Theorem 7.1.3 has a useful geometric interpretation when considered from the viewpoint of matrix transformations: If $A$ is an orthogonal matrix 
and $T_A\colon R^n \to R^n$ is multiplication by $A$, then we will call $T_A$ an orthogonal operator on $R^n$. It follows from parts (a) and (b) of Theorem 7.1.3 
that the orthogonal operators on $R^n$ are precisely those operators that leave the lengths of all vectors unchanged. This explains why, in Example 2, 
we found the standard matrices for the basic reflections and rotations of $R^2$ and $R^3$ to be orthogonal. 
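
The length-preserving and dot-product-preserving properties in Theorem 7.1.3 are easy to spot-check numerically. In the following Python sketch the angle, the two sample vectors, and the helper names `apply`, `dot`, and `norm` are all assumptions chosen for illustration.

```python
import math


def apply(m, v):
    """Return the matrix-vector product m v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    """Euclidean inner product of u and v."""
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    """Euclidean norm of v."""
    return math.sqrt(dot(v, v))


theta = 0.7  # arbitrary sample angle
A = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]  # a rotation, hence orthogonal
x, y = [3.0, -4.0], [1.0, 2.0]
Ax, Ay = apply(A, x), apply(A, y)

print(norm(x), norm(Ax))      # both close to 5
print(dot(x, y), dot(Ax, Ay)) # both close to -5
```

Because the cosine of the angle between two nonzero vectors is $\frac{x \cdot y}{\|x\|\|y\|}$, agreement of both printed pairs also shows that the angle between $x$ and $y$ is unchanged.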


Parts (a) and (c) of Theorem 7.1.3 imply that orthogonal 
operators leave the angle between two vectors unchanged. Why? 


Change of Orthonormal Basis 


Orthonormal bases for inner product spaces are convenient because, as the following theorem shows, many familiar formulas hold for such bases. 
We leave the proof as an exercise. 


THEOREM 7.1.4 


If $S$ is an orthonormal basis for an $n$-dimensional inner product space $V$, and if 

$$(u)_S = (u_1, u_2, \ldots, u_n) \quad\text{and}\quad (v)_S = (v_1, v_2, \ldots, v_n)$$

then: 
(a) $\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$ 

(b) $d(u, v) = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \cdots + (u_n - v_n)^2}$ 
(c) $\langle u, v \rangle = u_1v_1 + u_2v_2 + \cdots + u_nv_n$ 


Remark Note that the three parts of Theorem 7.1.4 can be expressed as 

$$\|u\| = \|(u)_S\|, \quad d(u, v) = d((u)_S, (v)_S), \quad \langle u, v \rangle = \langle (u)_S, (v)_S \rangle$$

where the norm, distance, and inner product on the left sides are relative to the inner product on $V$ and on the right sides are relative to the 
Euclidean inner product on $R^n$. 


Transitions between orthonormal bases for an inner product space are of special importance in geometry and various applications. The following 
theorem, whose proof is deferred to the end of this section, is concerned with transitions of this type. 


THEOREM 7.1.5 


Let V be a finite-dimensional inner product space. If P is the transition matrix from one orthonormal basis for V to another orthonormal 
basis for V, then P is an orthogonal matrix. 


EXAMPLE 4 Rotation of Axes in 2-Space 


In many problems a rectangular $xy$-coordinate system is given, and a new $x'y'$-coordinate system is obtained by rotating the 
$xy$-system counterclockwise about the origin through an angle $\theta$. When this is done, each point $Q$ in the plane has two sets of 
coordinates: coordinates $(x, y)$ relative to the $xy$-system and coordinates $(x', y')$ relative to the $x'y'$-system (Figure 7.1.1a). 


Figure 7.1.1 [panels (a), (b), (c), (d)] 


By introducing unit vectors $u_1$ and $u_2$ along the positive $x$- and $y$-axes and unit vectors $u_1'$ and $u_2'$ along the positive $x'$- and $y'$-axes, 
we can regard this rotation as a change from an old basis $B = \{u_1, u_2\}$ to a new basis $B' = \{u_1', u_2'\}$ (Figure 7.1.1b). Thus, the new 
coordinates $(x', y')$ and the old coordinates $(x, y)$ of a point $Q$ will be related by 

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = P^{-1}\begin{bmatrix} x \\ y \end{bmatrix} \qquad (2)$$

where $P$ is the transition matrix from $B'$ to $B$. To find $P$ we must determine the coordinate matrices of the new basis vectors $u_1'$ and $u_2'$ 
relative to the old basis. As indicated in Figure 7.1.1c, the components of $u_1'$ in the old basis are $\cos\theta$ and $\sin\theta$, so 

$$[u_1']_B = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$$

Similarly, from Figure 7.1.1d, we see that the components of $u_2'$ in the old basis are $\cos(\theta + \pi/2) = -\sin\theta$ and 
$\sin(\theta + \pi/2) = \cos\theta$, so 

$$[u_2']_B = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}$$

Thus the transition matrix from $B'$ to $B$ is 

$$P = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \qquad (3)$$

Observe that $P$ is an orthogonal matrix, as expected, since $B$ and $B'$ are orthonormal bases. Thus 

$$P^{-1} = P^T = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

so (2) yields 

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \qquad (4)$$

or, equivalently, 

$$x' = x\cos\theta + y\sin\theta$$
$$y' = -x\sin\theta + y\cos\theta \qquad (5)$$

These are sometimes called the rotation equations for $R^2$. 
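
The rotation equations $x' = x\cos\theta + y\sin\theta$, $y' = -x\sin\theta + y\cos\theta$ translate directly into code. The Python sketch below is only an illustration; the function names `rotate_axes` and `unrotate_axes` and the sample point and angle are assumptions.

```python
import math


def rotate_axes(x, y, theta):
    """New coordinates (x', y') of the point (x, y) after the axes are rotated
    counterclockwise through theta (multiplication by the transpose of P)."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)


def unrotate_axes(xp, yp, theta):
    """Old coordinates (x, y) recovered from (x', y') (multiplication by P)."""
    c, s = math.cos(theta), math.sin(theta)
    return (xp * c - yp * s, xp * s + yp * c)


# Sample point and angle (assumed for illustration).
new = rotate_axes(2.0, -1.0, math.pi / 4)
print(new)                                 # close to (1/sqrt(2), -3/sqrt(2))
print(unrotate_axes(*new, math.pi / 4))    # close to (2, -1)
```

Only the transpose is needed to invert the change of coordinates, which is the practical payoff of $P$ being orthogonal.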


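EXAMPLE 5 Rotation of Axes in 2-Space 


Use form (4) of the rotation equations for $R^2$ to find the new coordinates of the point $Q(2, -1)$ if the coordinate axes of a rectangular 
coordinate system are rotated through an angle of $\theta = \pi/4$. 


Solution Since 

$$\sin\frac{\pi}{4} = \cos\frac{\pi}{4} = \frac{1}{\sqrt{2}}$$

the equation in (4) becomes 

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Thus, if the old coordinates of the point $Q$ are $(x, y) = (2, -1)$, then 

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{3}{\sqrt{2}} \end{bmatrix}$$

so the new coordinates of $Q$ are $(x', y') = \left(\frac{1}{\sqrt{2}}, -\frac{3}{\sqrt{2}}\right)$. 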


Remark Observe that the coefficient matrix in (4) is the same as the standard matrix for the linear operator that rotates the vectors of $R^2$ through 
the angle $-\theta$ (see margin note for Table 5 of Section 4.9). This is to be expected since rotating the coordinate axes through the angle $\theta$ with the 
vectors of $R^2$ kept fixed has the same effect as rotating the vectors in $R^2$ through the angle $-\theta$ with the axes kept fixed. 


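EXAMPLE 6 Application to Rotation of Axes in 3-Space 


Suppose that a rectangular $xyz$-coordinate system is rotated around its $z$-axis counterclockwise (looking down the positive $z$-axis) 
through an angle $\theta$ (Figure 7.1.2). If we introduce unit vectors $u_1$, $u_2$, and $u_3$ along the positive $x$-, $y$-, and $z$-axes and unit vectors $u_1'$, 
$u_2'$, and $u_3'$ along the positive $x'$-, $y'$-, and $z'$-axes, we can regard the rotation as a change from the old basis $B = \{u_1, u_2, u_3\}$ to the 
new basis $B' = \{u_1', u_2', u_3'\}$. In light of Example 4, it should be evident that 

$$[u_1']_B = \begin{bmatrix} \cos\theta \\ \sin\theta \\ 0 \end{bmatrix} \quad\text{and}\quad [u_2']_B = \begin{bmatrix} -\sin\theta \\ \cos\theta \\ 0 \end{bmatrix}$$

Moreover, since $u_3'$ extends 1 unit up the positive $z'$-axis, 

$$[u_3']_B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$


Figure 7.1.2 
It follows that the transition matrix from $B'$ to $B$ is 

$$P = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

and the transition matrix from $B$ to $B'$ is 

$$P^{-1} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

(verify). Thus, the new coordinates $(x', y', z')$ of a point $Q$ can be computed from its old coordinates $(x, y, z)$ by 

$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$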

OPTIONAL 


We conclude this section with an optional proof of Theorem 7.1.5. 


Proof of Theorem 7.1.5 Assume that $V$ is an $n$-dimensional inner product space and that $P$ is the transition matrix from an orthonormal basis 
$B'$ to an orthonormal basis $B$. We will denote the norm relative to the inner product on $V$ by the symbol $\|\ \|_V$ to distinguish it from the norm 
relative to the Euclidean inner product on $R^n$, which we will denote by $\|\ \|$. 


Recall that $(u)_S$ denotes a coordinate vector expressed in 
comma-delimited form whereas $[u]_S$ denotes a coordinate vector 
expressed in column form. 


To prove that $P$ is orthogonal, we will use Theorem 7.1.3 and show that $\|Px\| = \|x\|$ for every vector $x$ in $R^n$. As a first step in this direction, 
recall from Theorem 7.1.4a that for any orthonormal basis for $V$ the norm of any vector $u$ in $V$ is the same as the norm of its coordinate vector with 
respect to the Euclidean inner product, that is 

$$\|u\|_V = \|[u]_{B'}\| = \|[u]_B\|$$

or 

$$\|u\|_V = \|[u]_{B'}\| = \|P[u]_{B'}\| \qquad (6)$$

Now let $x$ be any vector in $R^n$, and let $u$ be the vector in $V$ whose coordinate vector with respect to the basis $B'$ is $x$; that is, $[u]_{B'} = x$. Thus, from 
(6), 

$$\|u\|_V = \|x\| = \|Px\|$$

which proves that $P$ is orthogonal. 


Concept Review 

* Orthogonal matrix 

* Orthogonal operator 

* Properties of orthogonal matrices. 

* Geometric properties of an orthogonal operator 


* Properties of transition matrices from one orthonormal basis to another. 


Skills 
* Be able to identify an orthogonal matrix. 
* Know the possible values for the determinant of an orthogonal matrix. 


* Find the new coordinates of a point resulting from a rotation of axes. 


Exercise Set 7.1 


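1. (a) Show that the matrix 

$$A = \begin{bmatrix} \frac{4}{5} & 0 & -\frac{3}{5} \\ -\frac{9}{25} & \frac{4}{5} & -\frac{12}{25} \\ \frac{12}{25} & \frac{3}{5} & \frac{16}{25} \end{bmatrix}$$

is orthogonal in three ways: by calculating $A^TA$, by using part (b) of Theorem 7.1.1, and by using part (c) of Theorem 7.1.1. 


(b) Find the inverse of the matrix $A$ in part (a). 


Answer: 
(b) $A^{-1} = \begin{bmatrix} \frac{4}{5} & -\frac{9}{25} & \frac{12}{25} \\ 0 & \frac{4}{5} & \frac{3}{5} \\ -\frac{3}{5} & -\frac{12}{25} & \frac{16}{25} \end{bmatrix}$ 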

2. (a) Show that the matrix 


| 
l 
шз. wi win 


WINY WIN We 


WIP bof wji 


is orthogonal. 


(b) Let $T\colon R^3 \to R^3$ be multiplication by the matrix $A$ in part (a). Find $T(x)$ for the vector $x = (-2, 3, 5)$. Using the Euclidean inner product 
on $R^3$, verify that $\|T(x)\| = \|x\|$. 


3. Determine which of the following matrices are orthogonal. For those that are orthogonal, find the inverse. 


(а) |1 0 
0 1 
(1 1 
г ү? 
zl 8 
2 ү 
() |g 1 L 
"El 
10 0 
1 
00 — 
y2 
aj 1 1 1 
y2 ye үз 
0: lx, ММ 
ye үз 
doo. calls, „| 
v2 ye үз 
«ө 11 1 1 1 
2 2: „2-2 
da 25, "4. GP 
2 6 6 6 
d ubt dec» 
2 6 6 6 
d. Ghoum ol 
2 6 6 6 
© |1 0 0 0 
1 1 
0 = == 0 
үз 2 
1 
0 — 0 1 
ТА 
1 1 
0—— = 0 
үз 2 
Answer: 
(a) |1 0 
0 1 
(b) EN 
{2 
= ale 
V2 


SF Sale 
wi ak St 


(е) 


l 
Ale Ale ajo ple 


Ale Al Ale wl 


ala Ale Ale ple 


4. Prove that if $A$ is orthogonal, then $A^T$ is orthogonal. 


5. Verify that the reflection matrices in Tables 1 and 2 of Section 4.9 are orthogonal. 


6. Let a rectangular $x'y'$-coordinate system be obtained by rotating a rectangular $xy$-coordinate system counterclockwise through the angle 
$\theta = 3\pi/4$. 


(a) Find the $x'y'$-coordinates of the point whose $xy$-coordinates are $(-2, 6)$. 


(b) Find the $xy$-coordinates of the point whose $x'y'$-coordinates are $(5, 2)$. 


7. Repeat Exercise 6 with $\theta = \pi/3$. 


Answer: 


(a) $\left(-1 + 3\sqrt{3},\ 3 + \sqrt{3}\right)$ 


(b) $\left(\frac{5}{2} - \sqrt{3},\ \frac{5}{2}\sqrt{3} + 1\right)$ 


8. Let a rectangular $x'y'z'$-coordinate system be obtained by rotating a rectangular $xyz$-coordinate system counterclockwise about the $z$-axis 
(looking down the $z$-axis) through the angle $\theta = \pi/4$. 


(a) Find the $x'y'z'$-coordinates of the point whose $xyz$-coordinates are $(-1, 2, 5)$. 


(b) Find the $xyz$-coordinates of the point whose $x'y'z'$-coordinates are $(1, 6, -3)$. 


9. Repeat Exercise 8 for a rotation of $\theta = \pi/3$ counterclockwise about the $y$-axis (looking along the positive $y$-axis toward the origin). 


Answer: 

(a) $\left(-\frac{1}{2} - \frac{5}{2}\sqrt{3},\ 2,\ \frac{5}{2} - \frac{1}{2}\sqrt{3}\right)$ 

(b) $\left(\frac{1}{2} - \frac{3}{2}\sqrt{3},\ 6,\ -\frac{3}{2} - \frac{1}{2}\sqrt{3}\right)$ 


10. Repeat Exercise 8 for a rotation of $\theta = 3\pi/4$ counterclockwise about the $x$-axis (looking along the positive $x$-axis toward the origin). 


11. (a) A rectangular $x'y'z'$-coordinate system is obtained by rotating an $xyz$-coordinate system counterclockwise about the $y$-axis through an 
angle $\theta$ (looking along the positive $y$-axis toward the origin). Find a matrix $A$ such that 

$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = A\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

where $(x, y, z)$ and $(x', y', z')$ are the coordinates of the same point in the $xyz$- and $x'y'z'$-systems, respectively. 


(b) Repeat part (a) for a rotation about the $x$-axis. 


Answer: 
(a) $A = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}$ 
(b) $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{bmatrix}$ 


12. A rectangular $x''y''z''$-coordinate system is obtained by first rotating a rectangular $xyz$-coordinate system 60° counterclockwise about the 
$z$-axis (looking down the positive $z$-axis) to obtain an $x'y'z'$-coordinate system, and then rotating the $x'y'z'$-coordinate system 45° 
counterclockwise about the $y'$-axis (looking along the positive $y'$-axis toward the origin). Find a matrix $A$ such that 

$$\begin{bmatrix} x'' \\ y'' \\ z'' \end{bmatrix} = A\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

where $(x, y, z)$ and $(x'', y'', z'')$ are the $xyz$- and $x''y''z''$-coordinates of the same point. 


13. What conditions must $a$ and $b$ satisfy for the matrix 

$$\begin{bmatrix} a + b & b - a \\ a - b & b + a \end{bmatrix}$$

to be orthogonal? 


Answer: 


$a^2 + b^2 = \frac{1}{2}$ 


14. Prove that a $2 \times 2$ orthogonal matrix $A$ has only one of two possible forms: 

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad\text{or}\quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$$

where $0 \le \theta < 2\pi$. [Hint: Start with a general $2 \times 2$ matrix $A = (a_{ij})$, and use the fact that the column vectors form an orthonormal set in $R^2$.] 


15. (a) Use the result in Exercise 14 to prove that multiplication by a $2 \times 2$ orthogonal matrix is either a rotation or a reflection about the 
$x$-axis followed by a rotation. 


(b) Prove that multiplication by $A$ is a rotation if $\det(A) = 1$ and a reflection followed by a rotation if $\det(A) = -1$. 


16. Use the result in Exercise 15 to determine whether multiplication by $A$ is a rotation or a reflection about the $x$-axis followed by a rotation. 
Find the angle of rotation in either case. 


И е ШЕН 
-| ү а 
R R 
© [4a B 
ge) 2:2 
3 1 
2 2 


17. Find $a$, $b$, and $c$ for which the matrix 

$$\begin{bmatrix} a & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ b & \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{6}} \\ c & \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{bmatrix}$$

is orthogonal. Are the values of $a$, $b$, and $c$ unique? Explain. 
Answer: 
The only possibilities are $a = 0$, $b = -\frac{2}{\sqrt{6}}$, $c = \frac{1}{\sqrt{3}}$ or $a = 0$, $b = \frac{2}{\sqrt{6}}$, $c = -\frac{1}{\sqrt{3}}$. 
18. The result in Exercise 15 has an analog for $3 \times 3$ orthogonal matrices: It can be proved that multiplication by a $3 \times 3$ orthogonal matrix $A$ is a 
rotation about some axis if $\det(A) = 1$ and is a rotation about some axis followed by a reflection about some coordinate plane if $\det(A) = -1$. 
Determine whether multiplication by $A$ is a rotation or a rotation followed by a reflection. 
(a) 
A=|= 


AP ala jw 
AA |ә |ә 
Уә |м 2] 


(b) 


PN 
Ш 
AIA |ә [Mo 
I 
| 
ado IN Ia 


JJno |е |ә 


19. Use the fact stated in Exercise 18 and part (b) of Theorem 7.1.2 to show that a composition of rotations can always be accomplished by a single 


rotation about some appropriate axis. 
20. Prove the equivalence of statements (a) and (c) in Theorem 7.1.1. 


21. A linear operator on $R^2$ is called rigid if it does not change the lengths of vectors, and it is called angle preserving if it does not change the 
angle between nonzero vectors. 


(a) Name two different types of linear operators that are rigid. 
(b) Name two different types of linear operators that are angle preserving. 


(c) Are there any linear operators on $R^2$ that are rigid and not angle preserving? Angle preserving and not rigid? Justify your answer. 
Answer: 


(a) Rotations about the origin, reflections about any line through the origin, and any combination of these 
(b) Rotation about the origin, dilations, contractions, reflections about lines through the origin, and combinations of these 


(c) No; dilations and contractions 
True-False Exercises 


In parts (a)-(h) determine whether the statement is true or false, and justify your answer. 


(a) The matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}$ is orthogonal. 
Answer: 
False 
(b) The matrix E E is orthogonal. 
Answer: 
False 


(c) An $m \times n$ matrix $A$ is orthogonal if $A^TA = I$. 
Answer: 


False 


(d) A square matrix whose columns form an orthogonal set is orthogonal. 
Answer: 


False 


(e) Every orthogonal matrix is invertible. 
Answer: 


True 


(f) If $A$ is an orthogonal matrix, then $A^2$ is orthogonal and $(\det A)^2 = 1$. 


Answer: 


True 


(g) Every eigenvalue of an orthogonal matrix has absolute value 1. 


Answer: 


True 


(h) If A is a square matrix and $\|A\mathbf{u}\| = 1$ for all unit vectors $\mathbf{u}$, then A is orthogonal.
Answer: 


True
7.2 Orthogonal Diagonalization


In this section we will be concerned with the problem of diagonalizing a symmetric matrix A. As we will see, this problem is
closely related to that of finding an orthonormal basis for $R^n$ that consists of eigenvectors of A. Problems of this type are
important because many of the matrices that arise in applications are symmetric.


The Orthogonal Diagonalization Problem 


In Definition 1 of Section 5.2 we defined two square matrices, A and B, to be similar if there is an invertible matrix P such
that $P^{-1}AP = B$. In this section we will be concerned with the special case in which it is possible to find an orthogonal
matrix P for which this relationship holds.


We begin with the following definition. 


DEFINITION 1 


If A and B are square matrices, then we say that A and B are orthogonally similar if there is an orthogonal matrix P
such that $P^TAP = B$.


If A is orthogonally similar to some diagonal matrix, say

$$P^TAP = D$$

then we say that A is orthogonally diagonalizable and that P orthogonally diagonalizes A.
Our first goal in this section is to determine what conditions a matrix must satisfy to be orthogonally diagonalizable. As a
first step, observe that there is no hope of orthogonally diagonalizing a matrix that is not symmetric. To see why this is so,
suppose that

$$P^TAP = D \tag{1}$$

where P is an orthogonal matrix and D is a diagonal matrix. Multiplying both sides of (1) on the left by P and on the right by $P^T$, and then
using the fact that $PP^T = P^TP = I$, we can rewrite this equation as

$$A = PDP^T \tag{2}$$

Now transposing both sides of this equation and using the fact that a diagonal matrix is the same as its transpose we obtain

$$A^T = (PDP^T)^T = (P^T)^TD^TP^T = PDP^T = A$$

so A must be symmetric.


Conditions for Orthogonal Diagonalizability 


The following theorem shows that every symmetric matrix is, in fact, orthogonally diagonalizable. In this theorem, and for
the remainder of this section, orthogonal will mean orthogonal with respect to the Euclidean inner product on $R^n$.


THEOREM 7.2.1 


If A is an $n \times n$ matrix, then the following are equivalent.
(a) A is orthogonally diagonalizable.
(b) A has an orthonormal set of n eigenvectors.
(c) A is symmetric.


Proof 


(a) ⇒ (b) Since A is orthogonally diagonalizable, there is an orthogonal matrix P such that $P^{-1}AP$ is diagonal. As shown in
the proof of Theorem 5.2.1, the n column vectors of P are eigenvectors of A. Since P is orthogonal, these column vectors are
orthonormal, so A has n orthonormal eigenvectors.


(b) ⇒ (a) Assume that A has an orthonormal set of n eigenvectors $\{\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n\}$. As shown in the proof of Theorem 5.2.1,
the matrix P with these eigenvectors as columns diagonalizes A. Since these eigenvectors are orthonormal, P is orthogonal
and thus orthogonally diagonalizes A.


(a) ⇒ (c) In the proof that (a) ⇒ (b) we showed that an orthogonally diagonalizable $n \times n$ matrix A is orthogonally
diagonalized by an $n \times n$ matrix P whose columns form an orthonormal set of eigenvectors of A. Let D be the diagonal
matrix

$$D = P^TAP$$

from which it follows that

$$A = PDP^T$$

Thus,

$$A^T = (PDP^T)^T = PD^TP^T = PDP^T = A$$

which shows that A is symmetric.


(c) ⇒ (a) The proof of this part is beyond the scope of this text and will be omitted.


Properties of Symmetric Matrices 


Our next goal is to devise a procedure for orthogonally diagonalizing a symmetric matrix, but before we can do so, we need 
the following critical theorem about eigenvalues and eigenvectors of symmetric matrices. 


THEOREM 7.2.2 


If A is a symmetric matrix, then: 
(a) The eigenvalues of A are all real numbers. 


(b) Eigenvectors from different eigenspaces are orthogonal. 


Part (a), which requires results about complex vector spaces, will be discussed in Section 7.5. 


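Proof (b) Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be eigenvectors corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$ of the matrix A. We want to show
that $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$. Our proof of this involves the trick of starting with the expression $A\mathbf{v}_1 \cdot \mathbf{v}_2$. It follows from Formula 26 of
Section 3.2 and the symmetry of A that

$$A\mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot A^T\mathbf{v}_2 = \mathbf{v}_1 \cdot A\mathbf{v}_2 \tag{3}$$

But $\mathbf{v}_1$ is an eigenvector of A corresponding to $\lambda_1$, and $\mathbf{v}_2$ is an eigenvector of A corresponding to $\lambda_2$, so (3) yields the
relationship

$$\lambda_1 \mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot \lambda_2 \mathbf{v}_2$$

which can be rewritten as

$$(\lambda_1 - \lambda_2)(\mathbf{v}_1 \cdot \mathbf{v}_2) = 0 \tag{4}$$

But $\lambda_1 - \lambda_2 \neq 0$, since $\lambda_1$ and $\lambda_2$ were assumed distinct. Thus, it follows from (4) that $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$.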


Theorem 7.2.2 yields the following procedure for orthogonally diagonalizing a symmetric matrix. 


Orthogonally Diagonalizing an n x n Symmetric Matrix 


Step 1 Find a basis for each eigenspace of A. 
Step 2 Apply the Gram-Schmidt process to each of these bases to obtain an orthonormal basis for each eigenspace. 


Step 3 Form the matrix P whose columns are the vectors constructed in Step 2. This matrix will orthogonally
diagonalize A, and the eigenvalues on the diagonal of $D = P^TAP$ will be in the same order as their corresponding
eigenvectors in P.


Remark The justification of this procedure should be clear: Theorem 7.2.2 ensures that eigenvectors from different 
eigenspaces are orthogonal, and applying the Gram-Schmidt process ensures that the eigenvectors within the same 
eigenspace are orthonormal. It follows that the entire set of eigenvectors obtained by this procedure will be orthonormal. 


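EXAMPLE 1 Orthogonally Diagonalizing a Symmetric Matrix


Find an orthogonal matrix P that diagonalizes

$$A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix}$$


Solution We leave it for you to verify that the characteristic equation of A is

$$\det(\lambda I - A) = \det\begin{bmatrix} \lambda - 4 & -2 & -2 \\ -2 & \lambda - 4 & -2 \\ -2 & -2 & \lambda - 4 \end{bmatrix} = (\lambda - 2)^2(\lambda - 8) = 0$$

Thus, the distinct eigenvalues of A are $\lambda = 2$ and $\lambda = 8$. By the method used in Example 7 of Section 5.1, it
can be shown that

$$\mathbf{u}_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{u}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \tag{5}$$

form a basis for the eigenspace corresponding to $\lambda = 2$. Applying the Gram-Schmidt process to $\{\mathbf{u}_1, \mathbf{u}_2\}$
yields the following orthonormal eigenvectors (verify):

$$\mathbf{v}_1 = \begin{bmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} -\frac{1}{\sqrt{6}} \\ -\frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \end{bmatrix} \tag{6}$$

The eigenspace corresponding to $\lambda = 8$ has

$$\mathbf{u}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

as a basis. Applying the Gram-Schmidt process to $\{\mathbf{u}_3\}$ (i.e., normalizing $\mathbf{u}_3$) yields

$$\mathbf{v}_3 = \begin{bmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{bmatrix}$$

Finally, using $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ as column vectors, we obtain

$$P = \begin{bmatrix} -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{bmatrix}$$

which orthogonally diagonalizes A.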
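The result of an orthogonal diagonalization can be checked numerically. The following is a minimal sketch in plain Python (no libraries beyond `math`), assuming the matrix $A$ with 4 on the diagonal and 2 elsewhere and the matrix $P$ built from the normalized eigenvectors $(-1, 1, 0)/\sqrt{2}$, $(-1, -1, 2)/\sqrt{6}$, and $(1, 1, 1)/\sqrt{3}$. The helper names `transpose`, `matmul`, and `close` are illustrative, not part of the text.

```python
import math

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(m, expected, tol=1e-12):
    """Entrywise comparison that tolerates floating-point roundoff."""
    return all(abs(m[i][j] - expected[i][j]) < tol
               for i in range(len(m)) for j in range(len(m[0])))

A = [[4, 2, 2],
     [2, 4, 2],
     [2, 2, 4]]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# Columns are the orthonormal eigenvectors for eigenvalues 2, 2, 8.
P = [[-1 / s2, -1 / s6, 1 / s3],
     [ 1 / s2, -1 / s6, 1 / s3],
     [ 0.0,     2 / s6, 1 / s3]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
is_orthogonal = close(matmul(transpose(P), P), identity)

D = matmul(matmul(transpose(P), A), P)
is_diagonalized = close(D, [[2, 0, 0], [0, 2, 0], [0, 0, 8]])
```

Both flags should come out true: the first confirms that the columns of P are orthonormal, and the second that the eigenvalues appear on the diagonal in the same order as their eigenvectors appear in P.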


Spectral Decomposition 


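If A is a symmetric matrix that is orthogonally diagonalized by

$$P = [\mathbf{u}_1 \;\; \mathbf{u}_2 \;\; \cdots \;\; \mathbf{u}_n]$$

and if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A corresponding to the unit eigenvectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$, then we know that
$D = P^TAP$, where D is a diagonal matrix with the eigenvalues in the diagonal positions. It follows from this that the matrix
A can be expressed as

$$A = PDP^T = [\mathbf{u}_1 \;\; \mathbf{u}_2 \;\; \cdots \;\; \mathbf{u}_n] \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_n^T \end{bmatrix} = [\lambda_1\mathbf{u}_1 \;\; \lambda_2\mathbf{u}_2 \;\; \cdots \;\; \lambda_n\mathbf{u}_n] \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \\ \vdots \\ \mathbf{u}_n^T \end{bmatrix}$$

Multiplying out, we obtain the formula

$$A = \lambda_1\mathbf{u}_1\mathbf{u}_1^T + \lambda_2\mathbf{u}_2\mathbf{u}_2^T + \cdots + \lambda_n\mathbf{u}_n\mathbf{u}_n^T \tag{7}$$

which is called a spectral decomposition of A.


Note that each term of the spectral decomposition of A has the form $\lambda\mathbf{u}\mathbf{u}^T$, where $\mathbf{u}$ is a unit eigenvector of A in column
form, and $\lambda$ is an eigenvalue of A corresponding to $\mathbf{u}$. Since $\mathbf{u}$ has size $n \times 1$, it follows that the product $\mathbf{u}\mathbf{u}^T$ has size $n \times n$. It
can be proved (though we will not do it) that $\mathbf{u}\mathbf{u}^T$ is the standard matrix for the orthogonal projection of $R^n$ on the subspace
spanned by the vector $\mathbf{u}$. Accepting this to be so, the spectral decomposition of A tells us that the image of a vector $\mathbf{x}$ under
multiplication by a symmetric matrix A can be obtained by projecting $\mathbf{x}$ orthogonally on the lines (one-dimensional
subspaces) determined by the eigenvectors of A, then scaling those projections by the eigenvalues, and then adding the scaled
projections. Here is an example.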


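EXAMPLE 2 A Geometric Interpretation of a Spectral Decomposition


The matrix

$$A = \begin{bmatrix} 1 & 2 \\ 2 & -2 \end{bmatrix}$$

has eigenvalues $\lambda_1 = -3$ and $\lambda_2 = 2$ with corresponding eigenvectors

$$\mathbf{x}_1 = \begin{bmatrix} 1 \\ -2 \end{bmatrix} \quad \text{and} \quad \mathbf{x}_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

(verify). Normalizing these basis vectors yields

$$\mathbf{u}_1 = \frac{\mathbf{x}_1}{\|\mathbf{x}_1\|} = \begin{bmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{bmatrix} \quad \text{and} \quad \mathbf{u}_2 = \frac{\mathbf{x}_2}{\|\mathbf{x}_2\|} = \begin{bmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{bmatrix}$$

so a spectral decomposition of A is

$$\begin{bmatrix} 1 & 2 \\ 2 & -2 \end{bmatrix} = \lambda_1\mathbf{u}_1\mathbf{u}_1^T + \lambda_2\mathbf{u}_2\mathbf{u}_2^T = (-3)\begin{bmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt{5}} & -\frac{2}{\sqrt{5}} \end{bmatrix} + (2)\begin{bmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{bmatrix}\begin{bmatrix} \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \end{bmatrix} = (-3)\begin{bmatrix} \frac{1}{5} & -\frac{2}{5} \\ -\frac{2}{5} & \frac{4}{5} \end{bmatrix} + (2)\begin{bmatrix} \frac{4}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{1}{5} \end{bmatrix} \tag{8}$$

where, as noted above, the 2 x 2 matrices on the right side of (8) are the standard matrices for the orthogonal
projections onto the eigenspaces corresponding to $\lambda_1 = -3$ and $\lambda_2 = 2$, respectively.


Now let us see what this spectral decomposition tells us about the image of the vector $\mathbf{x} = (1, 1)$ under
multiplication by A. Writing $\mathbf{x}$ in column form, it follows that

$$A\mathbf{x} = \begin{bmatrix} 1 & 2 \\ 2 & -2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix} \tag{9}$$

and from (8) that

$$A\mathbf{x} = (-3)\begin{bmatrix} \frac{1}{5} & -\frac{2}{5} \\ -\frac{2}{5} & \frac{4}{5} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + (2)\begin{bmatrix} \frac{4}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{1}{5} \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = (-3)\begin{bmatrix} -\frac{1}{5} \\ \frac{2}{5} \end{bmatrix} + (2)\begin{bmatrix} \frac{6}{5} \\ \frac{3}{5} \end{bmatrix} = \begin{bmatrix} \frac{3}{5} \\ -\frac{6}{5} \end{bmatrix} + \begin{bmatrix} \frac{12}{5} \\ \frac{6}{5} \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix} \tag{10}$$

Formulas (9) and (10) provide two different ways of viewing the image of the vector (1, 1) under multiplication by
A: Formula (9) tells us directly that the image of this vector is (3, 0), whereas Formula (10) tells us that this image
can also be obtained by projecting (1, 1) onto the eigenspaces corresponding to $\lambda_1 = -3$ and $\lambda_2 = 2$ to obtain
the vectors $\left(-\frac{1}{5}, \frac{2}{5}\right)$ and $\left(\frac{6}{5}, \frac{3}{5}\right)$, then scaling by the eigenvalues to obtain $\left(\frac{3}{5}, -\frac{6}{5}\right)$ and $\left(\frac{12}{5}, \frac{6}{5}\right)$, and then
adding these vectors (see Figure 7.2.1).


Figure 7.2.1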
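The project, scale, and add reading of a spectral decomposition can be traced step by step in code. This is a minimal sketch in plain Python, assuming the matrix with rows (1, 2) and (2, -2), eigenvalues -3 and 2, and unit eigenvectors $(1, -2)/\sqrt{5}$ and $(2, 1)/\sqrt{5}$. The helper names `outer` and `matvec` are illustrative.

```python
import math

def outer(u):
    """Return the matrix u u^T for a vector u given as a list."""
    return [[ui * uj for uj in u] for ui in u]

def matvec(m, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

A = [[1, 2],
     [2, -2]]
lam1, lam2 = -3, 2
r5 = math.sqrt(5)
u1 = [1 / r5, -2 / r5]   # unit eigenvector for lam1
u2 = [2 / r5, 1 / r5]    # unit eigenvector for lam2

# u u^T is the standard matrix of the orthogonal projection onto span{u}.
P1, P2 = outer(u1), outer(u2)

# Spectral decomposition: A = lam1 * u1 u1^T + lam2 * u2 u2^T.
rebuilt = [[lam1 * P1[i][j] + lam2 * P2[i][j] for j in range(2)]
           for i in range(2)]

# Image of x = (1, 1): project onto each eigenspace, scale, then add.
x = [1, 1]
proj1, proj2 = matvec(P1, x), matvec(P2, x)
image = [lam1 * p + lam2 * q for p, q in zip(proj1, proj2)]
```

Up to roundoff, `rebuilt` reproduces A, the projections are (-1/5, 2/5) and (6/5, 3/5), and `image` equals (3, 0), which is also what multiplying A by (1, 1) directly gives.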


The Nondiagonalizable Case 


If A is an $n \times n$ matrix that is not orthogonally diagonalizable, it may still be possible to achieve considerable simplification
in the form of $P^TAP$ by choosing the orthogonal matrix P appropriately. We will consider two theorems (without proof) that
illustrate this. The first, due to the German mathematician Issai Schur, states that every square matrix A with real entries and real eigenvalues is orthogonally
similar to an upper triangular matrix that has the eigenvalues of A on the main diagonal.


THEOREM 7.2.3 Schur's Theorem 


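If A is an $n \times n$ matrix with real entries and real eigenvalues, then there is an orthogonal matrix P such that $P^TAP$ is
an upper triangular matrix of the form

$$P^TAP = \begin{bmatrix} \lambda_1 & \times & \times & \cdots & \times \\ 0 & \lambda_2 & \times & \cdots & \times \\ 0 & 0 & \lambda_3 & \cdots & \times \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{bmatrix} \tag{11}$$

in which $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of the matrix A repeated according to multiplicity.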


Issai Schur (1875—1941) 


Historical Note The life of the German mathematician Issai Schur is a sad reminder of the effect that Nazi policies 
had on Jewish intellectuals during the 1930s. Schur was a brilliant mathematician and a popular lecturer who 
attracted many students and researchers to the University of Berlin, where he worked and taught. His lectures 
sometimes attracted so many students that opera glasses were needed to see him from the back row. Schur's life 
became increasingly difficult under Nazi rule, and in April of 1933 he was forced to “retire” from the university 
under a law that prohibited non-Aryans from holding “civil service” positions. There was an outcry from many of his
students and colleagues who respected and liked him, but it did not stave off his complete dismissal in 1935. Schur,
who thought of himself as a loyal German, never understood the persecution and humiliation he received at Nazi
hands. He left Germany for Palestine in 1939, a broken man. Lacking in financial resources, he had to sell his 
beloved mathematics books and lived in poverty until his death in 1941. 

[Image: Courtesy Electronic Publishing Services, Inc., New York City] 


It is common to denote the upper triangular matrix in (11) by S (for Schur), in which case that equation can be rewritten as

$$A = PSP^T \tag{12}$$

which is called a Schur decomposition of A.

The next theorem, due to the German mathematician and engineer Karl Hessenberg (1904-1959), states that every square
matrix with real entries is orthogonally similar to a matrix in which each entry below the first subdiagonal is zero (Figure
7.2.2). Such a matrix is said to be in upper Hessenberg form.


First subdiagonal 


Figure 7.2.2 


THEOREM 7.2.4 Hessenberg's Theorem 


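If A is an $n \times n$ matrix, then there is an orthogonal matrix P such that $P^TAP$ is a matrix of the form

$$P^TAP = \begin{bmatrix} \times & \times & \cdots & \times & \times & \times \\ \times & \times & \cdots & \times & \times & \times \\ 0 & \times & \ddots & \times & \times & \times \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & \times & \times & \times \\ 0 & 0 & \cdots & 0 & \times & \times \end{bmatrix} \tag{13}$$


Note that unlike those in (11), the diagonal entries in (13)
are usually not the eigenvalues of A.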


It is common to denote the upper Hessenberg matrix in (13) by H (for Hessenberg), in which case that equation can be
rewritten as

$$A = PHP^T \tag{14}$$

which is called an upper Hessenberg decomposition of A.


Remark In many numerical algorithms the initial matrix is first converted to upper Hessenberg form to reduce the amount 
of computation in subsequent parts of the algorithm. Many computer packages have built-in commands for finding Schur and 
Hessenberg decompositions. 


Concept Review 


* Orthogonally similar matrices 


* Orthogonally diagonalizable matrix 

* Spectral decomposition (or eigenvalue decomposition) 
* Schur decomposition 

* Subdiagonal 

* Upper Hessenberg form

* Upper Hessenberg decomposition


Skills

* Be able to recognize an orthogonally diagonalizable matrix.

* Know that eigenvalues of symmetric matrices are real numbers.

* Know that for a symmetric matrix eigenvectors from different eigenspaces are orthogonal.

* Be able to orthogonally diagonalize a symmetric matrix.

* Be able to find the spectral decomposition of a symmetric matrix.

* Know the statement of Schur's Theorem.

* Know the statement of Hessenberg's Theorem.


Exercise Set 7.2 


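1. Find the characteristic equation of the given symmetric matrix, and then by inspection determine the dimensions of the
eigenspaces.


(a) $\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & -4 & 2 \\ -4 & 1 & -2 \\ 2 & -2 & -2 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$

(d) $\begin{bmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{bmatrix}$

(e) $\begin{bmatrix} 4 & 4 & 0 & 0 \\ 4 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

(f) $\begin{bmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & 0 & 0 \\ 0 & 0 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{bmatrix}$

Answer:


(a) $\lambda^2 - 5\lambda = 0$; $\lambda = 0$: one-dimensional; $\lambda = 5$: one-dimensional

(b) $\lambda^3 - 27\lambda - 54 = 0$; $\lambda = 6$: one-dimensional; $\lambda = -3$: two-dimensional

(c) $\lambda^3 - 3\lambda^2 = 0$; $\lambda = 3$: one-dimensional; $\lambda = 0$: two-dimensional

(d) $\lambda^3 - 12\lambda^2 + 36\lambda - 32 = 0$; $\lambda = 2$: two-dimensional; $\lambda = 8$: one-dimensional

(e) $\lambda^4 - 8\lambda^3 = 0$; $\lambda = 0$: three-dimensional; $\lambda = 8$: one-dimensional

(f) $\lambda^4 - 8\lambda^3 + 22\lambda^2 - 24\lambda + 9 = 0$; $\lambda = 1$: two-dimensional; $\lambda = 3$: two-dimensional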


In Exercises 2-9, find a matrix P that orthogonally diagonalizes A, and determine $P^{-1}AP$.


2 3 1 
‘A= 


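3. $A = \begin{bmatrix} 6 & 2\sqrt{3} \\ 2\sqrt{3} & 7 \end{bmatrix}$
Answer:


$P = \begin{bmatrix} -\frac{2}{\sqrt{7}} & \frac{\sqrt{3}}{\sqrt{7}} \\ \frac{\sqrt{3}}{\sqrt{7}} & \frac{2}{\sqrt{7}} \end{bmatrix}$; $P^{-1}AP = \begin{bmatrix} 3 & 0 \\ 0 & 10 \end{bmatrix}$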
4 6 —2 
А= 
ERH 
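7. $A = \begin{bmatrix} -2 & 0 & -36 \\ 0 & -3 & 0 \\ -36 & 0 & -23 \end{bmatrix}$
Answer:

$P = \begin{bmatrix} -\frac{4}{5} & 0 & \frac{3}{5} \\ 0 & 1 & 0 \\ \frac{3}{5} & 0 & \frac{4}{5} \end{bmatrix}$; $P^{-1}AP = \begin{bmatrix} 25 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & -50 \end{bmatrix}$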

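8. $A = \begin{bmatrix} 3 & 1 & 0 & 0 \\ 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

9. $A = \begin{bmatrix} -7 & 24 & 0 & 0 \\ 24 & 7 & 0 & 0 \\ 0 & 0 & -7 & 24 \\ 0 & 0 & 24 & 7 \end{bmatrix}$
Answer:

$P = \begin{bmatrix} -\frac{4}{5} & \frac{3}{5} & 0 & 0 \\ \frac{3}{5} & \frac{4}{5} & 0 & 0 \\ 0 & 0 & -\frac{4}{5} & \frac{3}{5} \\ 0 & 0 & \frac{3}{5} & \frac{4}{5} \end{bmatrix}$; $P^{-1}AP = \begin{bmatrix} -25 & 0 & 0 & 0 \\ 0 & 25 & 0 & 0 \\ 0 & 0 & -25 & 0 \\ 0 & 0 & 0 & 25 \end{bmatrix}$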


10. Assuming that $b \neq 0$, find a matrix that orthogonally diagonalizes

$$\begin{bmatrix} a & b \\ b & a \end{bmatrix}$$

11. Prove that if A is any $m \times n$ matrix, then $A^TA$ has an orthonormal set of n eigenvectors.


12. (a) Show that if $\mathbf{v}$ is any $n \times 1$ matrix and I is the $n \times n$ identity matrix, then $I - \mathbf{v}\mathbf{v}^T$ is orthogonally diagonalizable.


(b) Find a matrix P that orthogonally diagonalizes $I - \mathbf{v}\mathbf{v}^T$ if


13. Use the result in Exercise 19 of Section 5.1 to prove Theorem 7.2.2a for 2 x 2 symmetric matrices. 


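14. Does there exist a 3 x 3 symmetric matrix with eigenvalues $\lambda_1 = -1$, $\lambda_2 = 3$, $\lambda_3 = 7$ and corresponding eigenvectors

$$\begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}?$$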


If so, find such a matrix; if not, explain why not. 


15. Is the converse of Theorem 7.2.2b true? Explain.
Answer: 


No 


16. Find the spectral decomposition of each matrix. 


wf 6 —2 
-2 3 
(Q|-3 12 

1 -3 2 


2


0 -3 0
-36 0 -23

17. Show that if A is a symmetric orthogonal matrix, then 1 and -1 are the only possible eigenvalues.

18. (a) Find a 3 x 3 symmetric matrix whose eigenvalues are $\lambda_1 = -1$, $\lambda_2 = 3$, $\lambda_3 = 7$ and for which the corresponding
eigenvectors are $\mathbf{v}_1 = (0, 1, -1)$, $\mathbf{v}_2 = (1, 0, 0)$, $\mathbf{v}_3 = (0, 1, 1)$.
(b) Is there a 3 x 3 symmetric matrix with eigenvalues $\lambda_1 = -1$, $\lambda_2 = 3$, $\lambda_3 = 7$ and corresponding eigenvectors
$\mathbf{v}_1 = (0, 1, -1)$, $\mathbf{v}_2 = (1, 0, 0)$, $\mathbf{v}_3 = (1, 1, 1)$? Explain your reasoning.

19. Let A be a diagonalizable matrix with the property that eigenvectors from distinct eigenvalues are orthogonal. Must A be
symmetric? Explain your reasoning.

Answer:

Yes

20. Prove: If $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is an orthonormal basis for $R^n$, and if A can be expressed as

$$A = c_1\mathbf{u}_1\mathbf{u}_1^T + c_2\mathbf{u}_2\mathbf{u}_2^T + \cdots + c_n\mathbf{u}_n\mathbf{u}_n^T$$

then A is symmetric and has eigenvalues $c_1, c_2, \ldots, c_n$.
21. In this exercise we will establish that a matrix A is orthogonally diagonalizable if and only if it is symmetric. We have
shown that an orthogonally diagonalizable matrix is symmetric. The harder part is to prove that a symmetric matrix A is
orthogonally diagonalizable. We will proceed in two steps: first we will show that A is diagonalizable, and then we will
build on that result to show that A is orthogonally diagonalizable.

(a) Assume that A is a symmetric $n \times n$ matrix. One way to prove that A is diagonalizable is to show that for each
eigenvalue $\lambda_0$ the geometric multiplicity is equal to the algebraic multiplicity. For this purpose, assume that the
geometric multiplicity of $\lambda_0$ is k, let $B_0 = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ be an orthonormal basis for the eigenspace corresponding
to $\lambda_0$, extend this to an orthonormal basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ for $R^n$, and let P be the matrix having the vectors of
B as columns. As shown in Exercise 34(b) of Section 5.2, the product AP can be written as

$$AP = P\begin{bmatrix} \lambda_0 I_k & X \\ 0 & Y \end{bmatrix}$$

Use the fact that B is an orthonormal basis to prove that $X = 0$ [a zero matrix of size $k \times (n - k)$].
(b) It follows from part (a) and Exercise 34(c) of Section 5.2 that A has the same characteristic polynomial as

$$C = \begin{bmatrix} \lambda_0 I_k & 0 \\ 0 & Y \end{bmatrix}$$

Use this fact and Exercise 34(d) of Section 5.2 to prove that the algebraic multiplicity of $\lambda_0$ is the same as the
geometric multiplicity of $\lambda_0$. This establishes that A is diagonalizable.


(c) Use Theorem 7.2.2(b) and the fact that A is diagonalizable to prove that A is orthogonally diagonalizable.


True-False Exercises 


In parts (a)-(g) determine whether the statement is true or false, and justify your answer. 


(a) If A is a square matrix, then $AA^T$ and $A^TA$ are orthogonally diagonalizable.


Answer: 


True 


(b) If $\mathbf{v}_1$ and $\mathbf{v}_2$ are eigenvectors from distinct eigenspaces of a symmetric matrix, then $\|\mathbf{v}_1 + \mathbf{v}_2\|^2 = \|\mathbf{v}_1\|^2 + \|\mathbf{v}_2\|^2$.
Answer: 


True 


(c) Every orthogonal matrix is orthogonally diagonalizable. 
Answer: 


False 


(d) If A is both invertible and orthogonally diagonalizable, then $A^{-1}$ is orthogonally diagonalizable.
Answer: 


True 


(e) Every eigenvalue of an orthogonal matrix has absolute value 1. 
Answer: 


True 


(f) If A is an $n \times n$ orthogonally diagonalizable matrix, then there exists an orthonormal basis for $R^n$ consisting of
eigenvectors of A.


Answer: 


True


(g) If A is orthogonally diagonalizable, then A has real eigenvalues. 
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


7.3 Quadratic Forms 


In this section we will use matrix methods to study real-valued functions of several variables in which each term is either the 
square of a variable or the product of two variables. Such functions arise in a variety of applications, including geometry, 
vibrations of mechanical systems, statistics, and electrical engineering. 


Definition of a Quadratic Form 


Expressions of the form

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n$$

occurred in our study of linear equations and linear systems. If $a_1, a_2, \ldots, a_n$ are treated as fixed constants, then this expression
is a real-valued function of the n variables $x_1, x_2, \ldots, x_n$ and is called a linear form on $R^n$. All variables in a linear form occur
to the first power and there are no products of variables. Here we will be concerned with quadratic forms on $R^n$, which are
functions of the form

$$a_1x_1^2 + a_2x_2^2 + \cdots + a_nx_n^2 + (\text{all possible terms } a_kx_ix_j \text{ in which } x_i \neq x_j)$$

The terms of the form $a_kx_ix_j$ are called cross product terms. It is common to combine the cross product terms involving $x_ix_j$
with those involving $x_jx_i$ to avoid duplication. Thus, a general quadratic form on $R^2$ would typically be expressed as

$$a_1x_1^2 + a_2x_2^2 + 2a_3x_1x_2 \tag{1}$$

and a general quadratic form on $R^3$ as

$$a_1x_1^2 + a_2x_2^2 + a_3x_3^2 + 2a_4x_1x_2 + 2a_5x_1x_3 + 2a_6x_2x_3 \tag{2}$$


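If, as usual, we do not distinguish between the number a and the 1 x 1 matrix [a], and if we let $\mathbf{x}$ be the column vector of
variables, then (1) and (2) can be expressed in matrix form as

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} a_1 & a_3 \\ a_3 & a_2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \mathbf{x}^TA\mathbf{x}$$

$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} a_1 & a_4 & a_5 \\ a_4 & a_2 & a_6 \\ a_5 & a_6 & a_3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \mathbf{x}^TA\mathbf{x}$$

(verify). Note that the matrix A in these formulas is symmetric, that its diagonal entries are the coefficients of the squared terms,
and its off-diagonal entries are half the coefficients of the cross product terms. In general, if A is a symmetric $n \times n$ matrix and $\mathbf{x}$
is an $n \times 1$ column vector of variables, then we call the function

$$Q_A(\mathbf{x}) = \mathbf{x}^TA\mathbf{x} \tag{3}$$

the quadratic form associated with A. When convenient, (3) can be expressed in dot product notation as

$$\mathbf{x}^TA\mathbf{x} = \mathbf{x} \cdot A\mathbf{x} = A\mathbf{x} \cdot \mathbf{x} \tag{4}$$

In the case where A is a diagonal matrix, the quadratic form $\mathbf{x}^TA\mathbf{x}$ has no cross product terms; for example, if A has diagonal
entries $\lambda_1, \lambda_2, \ldots, \lambda_n$, then

$$\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \lambda_1x_1^2 + \lambda_2x_2^2 + \cdots + \lambda_nx_n^2$$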


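EXAMPLE 1 Expressing Quadratic Forms in Matrix Notation


In each part, express the quadratic form in the matrix notation $\mathbf{x}^TA\mathbf{x}$, where A is symmetric.
(a) $2x^2 + 6xy - 5y^2$


(b) $x_1^2 + 7x_2^2 - 3x_3^2 + 4x_1x_2 - 2x_1x_3 + 8x_2x_3$


Solution The diagonal entries of A are the coefficients of the squared terms, and the off-diagonal entries are half
the coefficients of the cross product terms, so

$$2x^2 + 6xy - 5y^2 = \begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 3 & -5 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

$$x_1^2 + 7x_2^2 - 3x_3^2 + 4x_1x_2 - 2x_1x_3 + 8x_2x_3 = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} 1 & 2 & -1 \\ 2 & 7 & 4 \\ -1 & 4 & -3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$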

Change of Variable in a Quadratic Form 


There are three important kinds of problems that occur in applications of quadratic forms: 


Problem 1 If $\mathbf{x}^TA\mathbf{x}$ is a quadratic form on $R^2$ or $R^3$, what kind of curve or surface is represented by the equation
$\mathbf{x}^TA\mathbf{x} = k$?

Problem 2 If $\mathbf{x}^TA\mathbf{x}$ is a quadratic form on $R^n$, what conditions must A satisfy for $\mathbf{x}^TA\mathbf{x}$ to have positive values for
$\mathbf{x} \neq \mathbf{0}$?

Problem 3 If $\mathbf{x}^TA\mathbf{x}$ is a quadratic form on $R^n$, what are its maximum and minimum values if $\mathbf{x}$ is constrained to satisfy
$\|\mathbf{x}\| = 1$?


We will consider the first two problems in this section and the third problem in the next section. 


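Many of the techniques for solving these problems are based on simplifying the quadratic form $\mathbf{x}^TA\mathbf{x}$ by making a substitution

$$\mathbf{x} = P\mathbf{y} \tag{5}$$

that expresses the variables $x_1, x_2, \ldots, x_n$ in terms of new variables $y_1, y_2, \ldots, y_n$. If P is invertible, then we call (5) a change of
variable, and if P is orthogonal, then we call (5) an orthogonal change of variable.


If we make the change of variable $\mathbf{x} = P\mathbf{y}$ in the quadratic form $\mathbf{x}^TA\mathbf{x}$, then we obtain

$$\mathbf{x}^TA\mathbf{x} = (P\mathbf{y})^TA(P\mathbf{y}) = \mathbf{y}^TP^TAP\mathbf{y} = \mathbf{y}^T(P^TAP)\mathbf{y} \tag{6}$$

Since the matrix $B = P^TAP$ is symmetric (verify), the effect of the change of variable is to produce a new quadratic form $\mathbf{y}^TB\mathbf{y}$
in the variables $y_1, y_2, \ldots, y_n$. In particular, if we choose P to orthogonally diagonalize A, then the new quadratic form will be
$\mathbf{y}^TD\mathbf{y}$, where D is a diagonal matrix with the eigenvalues of A on the main diagonal; that is,

$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^TD\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \lambda_1y_1^2 + \lambda_2y_2^2 + \cdots + \lambda_ny_n^2$$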


Thus, we have the following result, called the principal axes theorem. 


THEOREM 7.3.1 The Principal Axes Theorem 


If A is a symmetric $n \times n$ matrix, then there is an orthogonal change of variable that transforms the quadratic form $\mathbf{x}^TA\mathbf{x}$
into a quadratic form $\mathbf{y}^TD\mathbf{y}$ with no cross product terms. Specifically, if P orthogonally diagonalizes A, then making the
change of variable $\mathbf{x} = P\mathbf{y}$ in the quadratic form $\mathbf{x}^TA\mathbf{x}$ yields the quadratic form

$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^TD\mathbf{y} = \lambda_1y_1^2 + \lambda_2y_2^2 + \cdots + \lambda_ny_n^2$$

in which $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A corresponding to the eigenvectors that form the successive columns of
P.


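EXAMPLE 2 An Illustration of the Principal Axes Theorem


Find an orthogonal change of variable that eliminates the cross product terms in the quadratic form
$Q = x_1^2 - x_3^2 - 4x_1x_2 + 4x_2x_3$, and express Q in terms of the new variables.


Solution The quadratic form can be expressed in matrix notation as

$$Q = \mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} 1 & -2 & 0 \\ -2 & 0 & 2 \\ 0 & 2 & -1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

The characteristic equation of the matrix A is

$$\begin{vmatrix} \lambda - 1 & 2 & 0 \\ 2 & \lambda & -2 \\ 0 & -2 & \lambda + 1 \end{vmatrix} = \lambda^3 - 9\lambda = \lambda(\lambda + 3)(\lambda - 3) = 0$$

so the eigenvalues are $\lambda = 0, -3, 3$. We leave it for you to show that orthonormal bases for the three eigenspaces
are

$$\lambda = 0\colon \begin{bmatrix} \frac{2}{3} \\ \frac{1}{3} \\ \frac{2}{3} \end{bmatrix}, \quad \lambda = -3\colon \begin{bmatrix} -\frac{1}{3} \\ -\frac{2}{3} \\ \frac{2}{3} \end{bmatrix}, \quad \lambda = 3\colon \begin{bmatrix} -\frac{2}{3} \\ \frac{2}{3} \\ \frac{1}{3} \end{bmatrix}$$

Thus, a substitution $\mathbf{x} = P\mathbf{y}$ that eliminates the cross product terms is

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$


This produces the new quadratic form

$$Q = \mathbf{y}^T(P^TAP)\mathbf{y} = \begin{bmatrix} y_1 & y_2 & y_3 \end{bmatrix}\begin{bmatrix} 0 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = -3y_2^2 + 3y_3^2$$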


in which there are no cross product terms. 
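A change of variable of this kind can be spot-checked by evaluating both forms at a few points. The sketch below, in plain Python, assumes the form $Q = x_1^2 - x_3^2 - 4x_1x_2 + 4x_2x_3$ and the orthogonal matrix P whose columns are $(2, 1, 2)/3$, $(-1, -2, 2)/3$, and $(-2, 2, 1)/3$; the function names and sample points are illustrative choices.

```python
# Columns of P are unit eigenvectors for the eigenvalues 0, -3, 3 (in that order).
P = [[2 / 3, -1 / 3, -2 / 3],
     [1 / 3, -2 / 3,  2 / 3],
     [2 / 3,  2 / 3,  1 / 3]]

def Q(x):
    """Original quadratic form in x1, x2, x3."""
    return x[0]**2 - x[2]**2 - 4 * x[0] * x[1] + 4 * x[1] * x[2]

def Q_new(y):
    """Diagonal form in the new variables: 0*y1^2 - 3*y2^2 + 3*y3^2."""
    return -3 * y[1]**2 + 3 * y[2]**2

def change_of_variable(P, y):
    """Return x = P y."""
    return [sum(P[i][j] * y[j] for j in range(3)) for i in range(3)]

samples = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, -2, 3], [0.5, 4, -1]]
max_gap = max(abs(Q(change_of_variable(P, y)) - Q_new(y)) for y in samples)
```

Here `max_gap` is zero up to roundoff, so the two expressions agree at every sampled y. Agreement at finitely many points is only a sanity check, not a proof; the proof is the identity $\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T(P^TAP)\mathbf{y}$.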


Remark If A is a symmetric $n \times n$ matrix, then the quadratic form $\mathbf{x}^TA\mathbf{x}$ is a real-valued function whose range is the set of all
possible values for $\mathbf{x}^TA\mathbf{x}$ as $\mathbf{x}$ varies over $R^n$. It can be shown that an orthogonal change of variable $\mathbf{x} = P\mathbf{y}$ does not alter the
range of a quadratic form; that is, the set of all values for $\mathbf{x}^TA\mathbf{x}$ as $\mathbf{x}$ varies over $R^n$ is the same as the set of all values for
$\mathbf{y}^T(P^TAP)\mathbf{y}$ as $\mathbf{y}$ varies over $R^n$.


Quadratic Forms in Geometry 


Recall that a conic section or conic is a curve that results by cutting a double-napped cone with a plane (Figure 7.3.1). The most 
important conic sections are ellipses, hyperbolas, and parabolas, which result when the cutting plane does not pass through the 
vertex. Circles are special cases of ellipses that result when the cutting plane is perpendicular to the axis of symmetry of the 
cone. If the cutting plane passes through the vertex, then the resulting intersection is called a degenerate conic. The possibilities 
are a point, a pair of intersecting lines, or a single line. 


Circle, Ellipse, Parabola, Hyperbola


Figure 7.3.1


A central conic rotated out of standard position


Figure 7.3.2


Quadratic forms in $R^2$ arise naturally in the study of conic sections. For example, it is shown in analytic geometry that an
equation of the form

$$ax^2 + 2bxy + cy^2 + dx + ey + f = 0 \tag{7}$$

in which a, b, and c are not all zero, represents a conic section. If $d = e = 0$ in (7), then there are no linear terms, so the equation
becomes

$$ax^2 + 2bxy + cy^2 + f = 0 \tag{8}$$

and is said to represent a central conic. These include circles, ellipses, and hyperbolas, but not parabolas. Furthermore, if $b = 0$
in (8), then there is no cross product term (i.e., term involving xy), and the equation

$$ax^2 + cy^2 + f = 0 \tag{9}$$

is said to represent a central conic in standard position. The most important conics of this type are shown in Table 1.


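Table 1

Ellipse: $\frac{x^2}{\alpha^2} + \frac{y^2}{\beta^2} = 1$ $(\alpha \geq \beta > 0)$
Ellipse: $\frac{x^2}{\alpha^2} + \frac{y^2}{\beta^2} = 1$ $(\beta \geq \alpha > 0)$
Hyperbola: $\frac{x^2}{\alpha^2} - \frac{y^2}{\beta^2} = 1$ $(\alpha > 0, \beta > 0)$
Hyperbola: $\frac{y^2}{\beta^2} - \frac{x^2}{\alpha^2} = 1$ $(\alpha > 0, \beta > 0)$
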
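If we take the constant f in Equations (8) and (9) to the right side and let $k = -f$, then we can rewrite these equations in matrix
form as

$$\begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} a & b \\ b & c \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = k \quad \text{and} \quad \begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} a & 0 \\ 0 & c \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = k \tag{10}$$

The first of these corresponds to Equation (8) in which there is a cross product term 2bxy, and the second corresponds to Equation
(9) in which there is no cross product term. Geometrically, the existence of a cross product term signals that the graph of the
quadratic form is rotated about the origin, as in Figure 7.3.2. The three-dimensional analogs of the equations in (10) are

$$\begin{bmatrix} x & y & z \end{bmatrix}\begin{bmatrix} a & d & e \\ d & b & f \\ e & f & c \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = k \quad \text{and} \quad \begin{bmatrix} x & y & z \end{bmatrix}\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = k \tag{11}$$

If a, b, and c are not all zero, then the graphs of these equations in $R^3$ are called central quadrics; the second equation, which has no cross product terms, represents a central quadric in standard position.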


Identifying Conic Sections 


We are now ready to consider the first of the three problems posed earlier, identifying the curve or surface represented by an
equation $\mathbf{x}^TA\mathbf{x} = k$ in two or three variables. We will focus on the two-variable case. We noted above that an equation of the
form

$$ax^2 + 2bxy + cy^2 + f = 0 \tag{12}$$

represents a central conic. If $b = 0$, then the conic is in standard position, and if $b \neq 0$, it is rotated. It is an easy matter to
identify central conics in standard position by matching the equation with one of the standard forms. For example, the equation

$$9x^2 + 16y^2 - 144 = 0$$

can be rewritten as

$$\frac{x^2}{16} + \frac{y^2}{9} = 1$$

which, by comparison with Table 1, is an ellipse (Figure 7.3.3).


Figure 7.3.3


If a central conic is rotated out of standard position, then it can be identified by first rotating the coordinate axes to put it in
standard position and then matching the resulting equation with one of the standard forms in Table 1. To find a rotation that
eliminates the cross product term in the equation

$$ax^2 + 2bxy + cy^2 = k \tag{13}$$

it will be convenient to express the equation in the matrix form

$$\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} a & b \\ b & c \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = k \tag{14}$$

and look for a change of variable

$$\mathbf{x} = P\mathbf{x}'$$

that diagonalizes A and for which $\det(P) = 1$. Since we saw in Example 4 of Section 7.1 that the transition matrix

$$P = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{15}$$

has the effect of rotating the xy-axes of a rectangular coordinate system through an angle $\theta$, our problem reduces to finding $\theta$ that
diagonalizes A, thereby eliminating the cross product term in (13). If we make this change of variable, then in the x'y'-coordinate
system, Equation (14) will become

$$\mathbf{x}'^TD\mathbf{x}' = \begin{bmatrix} x' & y' \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}\begin{bmatrix} x' \\ y' \end{bmatrix} = k \tag{16}$$

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of A. The conic can now be identified by writing (16) in the form

$$\lambda_1x'^2 + \lambda_2y'^2 = k \tag{17}$$

and performing the necessary algebra to match it with one of the standard forms in Table 1. For example, if $\lambda_1$, $\lambda_2$, and k are
positive, then (17) represents an ellipse with an axis of length $2\sqrt{k/\lambda_1}$ in the x'-direction and $2\sqrt{k/\lambda_2}$ in the y'-direction. The
first column vector of P, which is a unit eigenvector corresponding to $\lambda_1$, is along the positive x'-axis; and the second column
vector of P, which is a unit eigenvector corresponding to $\lambda_2$, is a unit vector along the y'-axis. These are called the principal
axes of the ellipse, which explains why Theorem 7.3.1 is called “the principal axes theorem.” (See Figure 7.3.4.)


Figure 7.3.4 (unit eigenvector $(\cos\theta, \sin\theta)$ for $\lambda_1$; unit eigenvector $(-\sin\theta, \cos\theta)$ for $\lambda_2$)


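EXAMPLE 3 Identifying a Conic by Eliminating the Cross Product Term


(a) Identify the conic whose equation is $5x^2 - 4xy + 8y^2 - 36 = 0$ by rotating the xy-axes to put the conic in
standard position.
(b) Find the angle $\theta$ through which you rotated the xy-axes in part (a).


Solution
(a) The given equation can be written in the matrix form

$$\mathbf{x}^TA\mathbf{x} = 36$$

where

$$A = \begin{bmatrix} 5 & -2 \\ -2 & 8 \end{bmatrix}$$

The characteristic polynomial of A is

$$\begin{vmatrix} \lambda - 5 & 2 \\ 2 & \lambda - 8 \end{vmatrix} = (\lambda - 4)(\lambda - 9)$$

so the eigenvalues are $\lambda = 4$ and $\lambda = 9$. We leave it for you to show that orthonormal bases for the eigenspaces
are

$$\lambda = 4\colon \begin{bmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{bmatrix}, \quad \lambda = 9\colon \begin{bmatrix} -\frac{1}{\sqrt{5}} \\ \frac{2}{\sqrt{5}} \end{bmatrix}$$

Thus, A is orthogonally diagonalized by

$$P = \begin{bmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{bmatrix} \tag{18}$$


Had it turned out that $\det(P) = -1$, then we
would have interchanged the columns to reverse the
sign.


Moreover, it happens by chance that $\det(P) = 1$, so we are assured that the substitution $\mathbf{x} = P\mathbf{x}'$ performs a
rotation of axes. It follows from (16) that the equation of the conic in the x'y'-coordinate system is

$$\begin{bmatrix} x' & y' \end{bmatrix}\begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}\begin{bmatrix} x' \\ y' \end{bmatrix} = 36$$

which we can write as

$$4x'^2 + 9y'^2 = 36 \quad \text{or} \quad \frac{x'^2}{9} + \frac{y'^2}{4} = 1$$

We can now see from Table 1 that the conic is an ellipse whose axis has length $2\alpha = 6$ in the x'-direction and
length $2\beta = 4$ in the y'-direction.


(b) It follows from (15) that

$$P = \begin{bmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

which implies that

$$\cos\theta = \frac{2}{\sqrt{5}}, \quad \sin\theta = \frac{1}{\sqrt{5}}, \quad \tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{1}{2}$$

Thus, $\theta = \tan^{-1}\frac{1}{2} \approx 26.6^\circ$ (Figure 7.3.5).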


Figure 7.3.5 


Remark In the exercises we will ask you to show that if $b \neq 0$, then the cross product term in the equation

$$ax^2 + 2bxy + cy^2 = k$$

can be eliminated by a rotation through an angle $\theta$ that satisfies

$$\cot 2\theta = \frac{a - c}{2b} \tag{19}$$

We leave it for you to confirm that this is consistent with part (b) of the last example.
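The consistency check suggested here can also be done numerically. The sketch below, in plain Python, assumes the conic $5x^2 - 4xy + 8y^2 = 36$ (so $a = 5$, $b = -2$, $c = 8$). The functions `rotation_angle` and `rotated_coefficients` are hypothetical helpers: the first picks the solution of $\cot 2\theta = (a - c)/(2b)$ with $0 < \theta < \pi/2$, and the second computes the entries of $P^TAP$ for the rotation matrix P of that angle.

```python
import math

def rotation_angle(a, b, c):
    """Return theta in (0, pi/2) with cot(2*theta) = (a - c) / (2*b); requires b != 0."""
    # atan2 handles a == c; reducing mod pi selects 2*theta in (0, pi).
    return 0.5 * (math.atan2(2 * b, a - c) % math.pi)

def rotated_coefficients(a, b, c, theta):
    """Entries (a', b', c') of P^T A P, so the rotated form is a'x'^2 + 2b'x'y' + c'y'^2."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a_new = a * cos_t**2 + 2 * b * cos_t * sin_t + c * sin_t**2
    b_new = (c - a) * cos_t * sin_t + b * (cos_t**2 - sin_t**2)
    c_new = a * sin_t**2 - 2 * b * cos_t * sin_t + c * cos_t**2
    return a_new, b_new, c_new

theta = rotation_angle(5, -2, 8)
a_new, b_new, c_new = rotated_coefficients(5, -2, 8, theta)
theta_degrees = math.degrees(theta)
```

For these coefficients `theta` agrees with $\tan^{-1}\frac{1}{2}$ (about 26.6 degrees), the cross product coefficient `b_new` vanishes up to roundoff, and `a_new`, `c_new` come out as the eigenvalues 4 and 9. Note that $\theta + 90^\circ$ satisfies the same cotangent condition; it also removes the cross product term but swaps the roles of the two axes.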


Positive Definite Quadratic Forms 


We will now consider the second of the three problems posed earlier, determining conditions under which $\mathbf{x}^TA\mathbf{x} > 0$ for all
nonzero values of $\mathbf{x}$. We will explain why this is important shortly, but first we introduce some terminology.


The terminology in Definition 1 also applies to the 
matrix A; that is, A is positive definite, negative definite, 
or indefinite in accordance with whether the associated 
quadratic form has that property. 


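DEFINITION 1


A quadratic form $\mathbf{x}^TA\mathbf{x}$ is said to be
positive definite if $\mathbf{x}^TA\mathbf{x} > 0$ for $\mathbf{x} \neq \mathbf{0}$
negative definite if $\mathbf{x}^TA\mathbf{x} < 0$ for $\mathbf{x} \neq \mathbf{0}$
indefinite if $\mathbf{x}^TA\mathbf{x}$ has both positive and negative values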


The following theorem, whose proof is deferred to the end of the section, provides a way of using eigenvalues to determine
whether a matrix A and its associated quadratic form $\mathbf{x}^TA\mathbf{x}$ are positive definite, negative definite, or indefinite.


THEOREM 7.3.2


If A is a symmetric matrix, then:


(a) $\mathbf{x}^TA\mathbf{x}$ is positive definite if and only if all eigenvalues of A are positive.
(b) $\mathbf{x}^TA\mathbf{x}$ is negative definite if and only if all eigenvalues of A are negative.
(c) $\mathbf{x}^TA\mathbf{x}$ is indefinite if and only if A has at least one positive eigenvalue and at least one negative eigenvalue.


Remark The three classifications in Definition 1 do not exhaust all of the possibilities. For example, a quadratic form for
which $\mathbf{x}^TA\mathbf{x} \geq 0$ if $\mathbf{x} \neq \mathbf{0}$ is called positive semidefinite, and one for which $\mathbf{x}^TA\mathbf{x} \leq 0$ if $\mathbf{x} \neq \mathbf{0}$ is called negative semidefinite.
Every positive definite form is positive semidefinite, but not conversely, and every negative definite form is negative
semidefinite, but not conversely (why?). By adjusting the proof of Theorem 7.3.2 appropriately, one can prove that $\mathbf{x}^TA\mathbf{x}$ is
positive semidefinite if and only if all eigenvalues of A are nonnegative and is negative semidefinite if and only if all
eigenvalues of A are nonpositive.


EXAMPLE 4 Positive Definite Quadratic Forms


It is not usually possible to tell from the signs of the entries in a symmetric matrix A whether that matrix is
positive definite, negative definite, or indefinite. For example, the entries of the matrix
$$A = \begin{bmatrix} 3 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{bmatrix}$$
are nonnegative, but the matrix is indefinite since its eigenvalues are $\lambda = 1, 4, -2$ (verify). To see this another
way, let us write out the quadratic form as
$$\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 3 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = 3x_1^2 + 2x_1x_2 + 2x_1x_3 + 4x_2x_3$$


Positive definite and negative definite matrices 
are invertible. Why? 


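We can now see, for example, that
$$\mathbf{x}^TA\mathbf{x} = 4 \quad\text{for}\quad x_1 = 0,\ x_2 = 1,\ x_3 = 1$$
and
$$\mathbf{x}^TA\mathbf{x} = -4 \quad\text{for}\quad x_1 = 0,\ x_2 = 1,\ x_3 = -1$$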
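
The two evaluations above can be reproduced with a short sketch. It is only an illustration under the assumption that the matrix is stored as a list of rows; the helper name `quadratic_form` is hypothetical.

```python
def quadratic_form(A, x):
    """Evaluate x^T A x for a square matrix A (list of rows) and a vector x (list)."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Matrix of Example 4
A = [[3, 1, 1],
     [1, 0, 2],
     [1, 2, 0]]

print(quadratic_form(A, [0, 1, 1]))    # 4
print(quadratic_form(A, [0, 1, -1]))   # -4
```

Because the form takes both a positive and a negative value, it is indefinite even though every entry of A is nonnegative.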


Classifying Conic Sections Using Eigenvalues 


If $\mathbf{x}^TB\mathbf{x} = k$ is the equation of a conic, and if $k \neq 0$, then we can divide through by k and rewrite the equation in the form
$$\mathbf{x}^TA\mathbf{x} = 1 \qquad (20)$$
where $A = (1/k)B$. If we now rotate the coordinate axes to eliminate the cross product term (if any) in this equation, then the
equation of the conic in the new coordinate system will be of the form
$$\lambda_1 x'^2 + \lambda_2 y'^2 = 1 \qquad (21)$$
in which $\lambda_1$ and $\lambda_2$ are the eigenvalues of A. The particular type of conic represented by this equation will depend on the signs
of the eigenvalues $\lambda_1$ and $\lambda_2$. For example, you should be able to see from 21 that:


* $\mathbf{x}^TA\mathbf{x} = 1$ represents an ellipse if $\lambda_1 > 0$ and $\lambda_2 > 0$.
* $\mathbf{x}^TA\mathbf{x} = 1$ has no graph if $\lambda_1 < 0$ and $\lambda_2 < 0$.
* $\mathbf{x}^TA\mathbf{x} = 1$ represents a hyperbola if $\lambda_1$ and $\lambda_2$ have opposite signs.


In the case of the ellipse, Equation 21 can be rewritten as
$$\frac{x'^2}{(1/\sqrt{\lambda_1})^2} + \frac{y'^2}{(1/\sqrt{\lambda_2})^2} = 1$$
so the axes of the ellipse have lengths $2/\sqrt{\lambda_1}$ and $2/\sqrt{\lambda_2}$ (Figure 7.3.6).


Figure 7.3.6


The following theorem is an immediate consequence of this discussion and Theorem 7.3.2. 


THEOREM 7.3.3 


If A is a symmetric 2 × 2 matrix, then:

(a) $\mathbf{x}^TA\mathbf{x} = 1$ represents an ellipse if A is positive definite.
(b) $\mathbf{x}^TA\mathbf{x} = 1$ has no graph if A is negative definite.
(c) $\mathbf{x}^TA\mathbf{x} = 1$ represents a hyperbola if A is indefinite.


In Example 3 we performed a rotation to show that the equation
$$5x^2 - 4xy + 8y^2 - 36 = 0$$
represents an ellipse with a major axis of length 6 and a minor axis of length 4. This conclusion can also be obtained by
rewriting the equation in the form
$$\frac{5}{36}x^2 - \frac{1}{9}xy + \frac{2}{9}y^2 = 1$$
and showing that the associated matrix
$$A = \begin{bmatrix} \frac{5}{36} & -\frac{1}{18} \\ -\frac{1}{18} & \frac{2}{9} \end{bmatrix}$$
has eigenvalues $\lambda_1 = \frac{1}{9}$ and $\lambda_2 = \frac{1}{4}$. These eigenvalues are positive, so the matrix A is positive definite and the equation
represents an ellipse. Moreover, it follows from 21 that the axes of the ellipse have lengths $2/\sqrt{\lambda_1} = 6$ and $2/\sqrt{\lambda_2} = 4$, which
is consistent with Example 3.
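
As a rough numerical companion to Theorem 7.3.3, the sketch below classifies $\mathbf{x}^TA\mathbf{x} = 1$ for a symmetric 2 × 2 matrix from the signs of its eigenvalues. The function names are hypothetical, and the closed-form eigenvalue formula used here applies only to symmetric 2 × 2 matrices.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smaller one first."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def classify_conic(a, b, d):
    """Classify the graph of x^T A x = 1 for A = [[a, b], [b, d]]."""
    lo, hi = eigenvalues_sym2(a, b, d)
    if lo > 0:
        return "ellipse"        # A positive definite
    if hi < 0:
        return "no graph"       # A negative definite
    if lo < 0 < hi:
        return "hyperbola"      # A indefinite
    return "not covered by Theorem 7.3.3"   # some eigenvalue is zero

# The matrix obtained from 5x^2 - 4xy + 8y^2 = 36 after dividing by 36
a, b, d = 5 / 36, -1 / 18, 2 / 9
lo, hi = eigenvalues_sym2(a, b, d)
print(classify_conic(a, b, d))                  # ellipse
print(2 / math.sqrt(lo), 2 / math.sqrt(hi))     # close to 6 and 4
```

The axis lengths are computed as $2/\sqrt{\lambda}$ for each eigenvalue, matching the lengths found by rotation in Example 3.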


Identifying Positive Definite Matrices 


Positive definite matrices are the most important symmetric matrices in applications, so it will be useful to learn a little more
about them. We already know that a symmetric matrix is positive definite if and only if its eigenvalues are all positive; now we
will give a criterion that can be used to determine whether a symmetric matrix is positive definite without finding the
eigenvalues. For this purpose we define the kth principal submatrix of an n × n matrix A to be the k × k submatrix consisting of
the first k rows and columns of A. For example, here are the principal submatrices of a general 4 × 4 matrix:


First principal submatrix: $\begin{bmatrix} a_{11} \end{bmatrix}$

Second principal submatrix: $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$

Third principal submatrix: $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$

Fourth principal submatrix: $\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$


The following theorem, which we state without proof, provides a determinant test for ascertaining whether a symmetric matrix is 
positive definite. 


THEOREM 7.3.4 


A symmetric matrix А is positive definite if and only if the determinant of every principal submatrix is positive. 


EXAMPLE 5 Working with Principal Submatrices + 


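The matrix
$$A = \begin{bmatrix} 2 & -1 & -3 \\ -1 & 2 & 4 \\ -3 & 4 & 9 \end{bmatrix}$$
is positive definite since the determinants
$$|2| = 2, \qquad \begin{vmatrix} 2 & -1 \\ -1 & 2 \end{vmatrix} = 3, \qquad \begin{vmatrix} 2 & -1 & -3 \\ -1 & 2 & 4 \\ -3 & 4 & 9 \end{vmatrix} = 1$$
are all positive. Thus, we are guaranteed that all eigenvalues of A are positive and $\mathbf{x}^TA\mathbf{x} > 0$ for $\mathbf{x} \neq \mathbf{0}$.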
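
The determinant test of Theorem 7.3.4 is easy to sketch in code. The version below is a minimal illustration for small symmetric matrices with exact (integer or rational) entries; it uses cofactor expansion, which is simple but slow for large matrices, and the function names are hypothetical.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def principal_minors(A):
    """Determinants of the first, second, ..., nth principal submatrices of A."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """Determinant test; A is assumed to be symmetric."""
    return all(m > 0 for m in principal_minors(A))

A = [[2, -1, -3],
     [-1, 2, 4],
     [-3, 4, 9]]
print(principal_minors(A))       # [2, 3, 1]
print(is_positive_definite(A))   # True
```

Applied to the indefinite matrix of Example 4, the same test returns False because the second principal submatrix has determinant −1.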


OPTIONAL 


We conclude this section with an optional proof of Theorem 7.3.2. 


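Proofs of Theorem 7.3.2(a) and (b) It follows from the principal axes theorem (Theorem 7.3.1) that there is an orthogonal
change of variable $\mathbf{x} = P\mathbf{y}$ for which
$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^TD\mathbf{y} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \qquad (23)$$
where the $\lambda$'s are the eigenvalues of A. Moreover, it follows from the invertibility of P that $\mathbf{y} \neq \mathbf{0}$ if and only if $\mathbf{x} \neq \mathbf{0}$, so the
values of $\mathbf{x}^TA\mathbf{x}$ for $\mathbf{x} \neq \mathbf{0}$ are the same as the values of $\mathbf{y}^TD\mathbf{y}$ for $\mathbf{y} \neq \mathbf{0}$. Thus, it follows from 23 that $\mathbf{x}^TA\mathbf{x} > 0$ for $\mathbf{x} \neq \mathbf{0}$ if and
only if all of the $\lambda$'s in that equation are positive, and that $\mathbf{x}^TA\mathbf{x} < 0$ for $\mathbf{x} \neq \mathbf{0}$ if and only if all of the $\lambda$'s are negative. This
proves parts (a) and (b).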


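Proof (c) Assume that A has at least one positive eigenvalue and at least one negative eigenvalue, and to be specific, suppose
that $\lambda_1 > 0$ and $\lambda_2 < 0$ in 23. Then
$$\mathbf{x}^TA\mathbf{x} > 0 \quad\text{if } y_1 = 1 \text{ and all other } y\text{'s are } 0$$
and
$$\mathbf{x}^TA\mathbf{x} < 0 \quad\text{if } y_2 = 1 \text{ and all other } y\text{'s are } 0$$
which proves that $\mathbf{x}^TA\mathbf{x}$ is indefinite. Conversely, if $\mathbf{x}^TA\mathbf{x} > 0$ for some $\mathbf{x}$, then $\mathbf{y}^TD\mathbf{y} > 0$ for some $\mathbf{y}$, so at least one of the $\lambda$'s
in 23 must be positive. Similarly, if $\mathbf{x}^TA\mathbf{x} < 0$ for some $\mathbf{x}$, then $\mathbf{y}^TD\mathbf{y} < 0$ for some $\mathbf{y}$, so at least one of the $\lambda$'s in 23 must be
negative, which completes the proof.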


Concept Review 

* Linear form 

* Quadratic form 

* Cross product term 

* Quadratic form associated with a matrix 
* Change of variable 

* Orthogonal change of variable 

* Principal Axes Theorem 


* Conic section 


* Degenerate conic 

* Central conic 

* Standard position of a central conic 

* Standard form of a central conic 

* Central quadric 

* Principal axes of an ellipse 

* Positive definite quadratic form 

* Negative definite quadratic form 

* Indefinite quadratic form 

* Positive semidefinite quadratic form 
* Negative semidefinite quadratic form 
* Principal submatrix 

Skills 

* Express a quadratic form in the matrix notation $\mathbf{x}^TA\mathbf{x}$, where A is a symmetric matrix.


* Find an orthogonal change of variable that eliminates the cross product terms in a quadratic form, and express the 
quadratic form in terms of the new variable. 


* Identify a conic section from an equation by rotating axes to place the conic in standard position, and find the angle of 
rotation. 


* Identify a conic section using eigenvalues.


* Classify matrices and quadratic forms as positive definite, negative definite, indefinite, positive semidefinite or 
negative semidefinite. 


Exercise Set 7.3 


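In Exercises 1–2, express the quadratic form in the matrix notation $\mathbf{x}^TA\mathbf{x}$, where A is a symmetric matrix.


1. (a) $3x_1^2 + 7x_2^2$
(b) $4x_1^2 - 9x_2^2 - 6x_1x_2$
(c) $9x_1^2 - x_2^2 + 4x_3^2 + 6x_1x_2 - 8x_1x_3 + x_2x_3$


Answer:


(c) $\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 9 & 3 & -4 \\ 3 & -1 & \frac{1}{2} \\ -4 & \frac{1}{2} & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$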


2. (а) ax? + 5x 4x2 
(b) —7x1x2 


(с) ae + ха - 32 = 5x1x2 + 9xi1x3 


In Exercises 3-4, find a formula for the quadratic form that does not use matrices. 


‘PS SI 


Answer: 


$2x^2 + 5y^2 - 6xy$


4 7 
2 2 : ХІ 
[х1 x2 x3]| 7 о 6||^2 
2 х3 
163 


In Exercises 5–8, find an orthogonal change of variables that eliminates the cross product terms in the quadratic form Q, and
express Q in terms of the new variables.


5. $Q = 2x_1^2 + 2x_2^2 - 2x_1x_2$
Answer: 
3 — 2,2 2 
xj A Q-—3yi»2 


1 
ay] |, 
2] 


6. $Q = 5x_1^2 + 2x_2^2 + 4x_3^2 + 4x_1x_2$
7. $Q = 3x_1^2 + 4x_2^2 + 5x_3^2 + 4x_1x_2 - 4x_2x_3$


Answer: 
mE 25 1 
xi 3 3 3 yı 
х2 |= 2 i 2 ¥2 |; Q=y? + 4y3 +72 
x 
3 1 2 E J3 
3 3 3 


8. $Q = 2x_1^2 + 5x_2^2 + 5x_3^2 + 4x_1x_2 - 4x_1x_3 - 8x_2x_3$


In Exercises 9–10, express the quadratic equation in the matrix form $\mathbf{x}^TA\mathbf{x} + K\mathbf{x} + f = 0$, where $\mathbf{x}^TA\mathbf{x}$ is the associated
quadratic form and K is an appropriate matrix.


9. (a) $2x^2 + xy + x - 6y + 2 = 0$
(b) $y^2 + 7x - 8y - 5 = 0$

Answer: 


(a) 2 


© nml 


[;]+1-16][;]+2= 


voie S] enr] 


10. (а) x? xy + 5х 4+ 8y —3—0 
(b) 5ху=8 


In Exercises 11—12, identify the conic section represented by the equation. 


11. (a) $2x^2 + 5y^2 = 20$
(b) $x^2 - y^2 - 8 = 0$
(c) $7y^2 - 2x = 0$
(d) $x^2 + y^2 - 25 = 0$


Answer: 


(a) ellipse 
(b) hyperbola 
(c) parabola 
(d) circle 
12. (a) $4x^2 + 9y^2 = 1$
(b) $4x^2 - 5y^2 = 20$
(c) $-x^2 = 2y$
(d) $x^2 - 3 = -y^2$


In Exercises 13—16, identify the conic section represented by the equation by rotating axes to place the conic in standard 
position. Find an equation of the conic in the rotated coordinates, and find the angle of rotation. 


13. $2x^2 - 4xy - y^2 + 8 = 0$
Answer:
Hyperbola: $2(y')^2 - 3(x')^2 = 8$; $\theta \approx -26.6^\circ$
14. $5x^2 + 4xy + 5y^2 = 9$
15. $11x^2 + 24xy + 4y^2 - 15 = 0$
Answer:


Hyperbola: $4(x')^2 - (y')^2 = 3$; $\theta \approx 36.9^\circ$


16. x? 4 ху tyes 


In Exercises 17—18, determine by inspection whether the matrix is positive definite, negative definite, indefinite, positive 
semidefinite, or negative semidefinite. 


17. (а) 1 0 
0 2 


[-1 0 
0 —2 


(à) [1 0 
0 0 

()]0 0 
0 —2 

Answer: 


(a) Positive definite 

(b) Negative definite 

(c) Indefinite 

(d) Positive semidefinite 


(e) Negative semidefinite 


18.a) [2 0 
0 —5 
(6) | =2 0 
0 —5 
(с) |2 0 
0 5 
(d 10 0 
0 —5 
(е) |2 0 
0 0 
In Exercises 19–24, classify the quadratic form as positive definite, negative definite, indefinite, positive semidefinite, or
negative semidefinite.
19. $x_1^2 + x_2^2$
Answer:


Positive definite
20. $-x_1^2 - 3x_2^2$
21. $(x_1 - x_2)^2$
Answer:


Positive semidefinite
22. $-(x_1 - x_2)^2$
23. $x_1^2 - x_2^2$

Answer:


Indefinite
24. $x_1x_2$


In Exercises 25—26, show that the matrix A is positive definite first by using Theorem 7.3.2 and second by using Theorem 
7.3.4. 


25. 


(b) 3-1 0 
A-|-1 2 -1 
0=1 3 


In Exercises 27–28, find all values of k for which the quadratic form is positive definite.
27. $5x_1^2 + x_2^2 + kx_3^2 + 4x_1x_2 - 2x_1x_3 - 2x_2x_3$


Answer:


$k > 2$
28. $3x_1^2 + x_2^2 + 2x_3^2 - 2x_1x_3 + 2kx_2x_3$


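29. Let $\mathbf{x}^TA\mathbf{x}$ be a quadratic form in the variables $x_1, x_2, \ldots, x_n$, and define $T: R^n \to R$ by $T(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$.
(a) Show that $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + 2\mathbf{x}^TA\mathbf{y} + T(\mathbf{y})$.
(b) Show that $T(c\mathbf{x}) = c^2T(\mathbf{x})$.
30. Express the quadratic form $(c_1x_1 + c_2x_2 + \cdots + c_nx_n)^2$ in the matrix notation $\mathbf{x}^TA\mathbf{x}$, where A is symmetric.
31. In statistics, the quantities
$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)$$
and
$$s_x^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right]$$
are called, respectively, the sample mean and sample variance of $\mathbf{x} = (x_1, x_2, \ldots, x_n)$.


(a) Express the quadratic form $s_x^2$ in the matrix notation $\mathbf{x}^TA\mathbf{x}$, where A is symmetric.


(b) Is $s_x^2$ a positive definite quadratic form? Explain.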


Answer: 
(a) 1 zal ME ee е 
aín = 1) n(x = 1) 
EE: IER 1 snas аш с> 
A-| з(л—1) п n(n—1) 
C RNC ы. Шш 
n(»—1) n(»—1) n 
(b) Yes 


32. The graph in an xyz-coordinate system of an equation of form $ax^2 + by^2 + cz^2 = 1$ in which a, b, and c are positive is a
surface called a central ellipsoid in standard position (see the accompanying figure). This is the three-dimensional
generalization of the ellipse $ax^2 + by^2 = 1$ in the xy-plane. The intersections of the ellipsoid $ax^2 + by^2 + cz^2 = 1$ with the
coordinate axes determine three line segments called the axes of the ellipsoid. If a central ellipsoid is rotated about the origin
so two or more of its axes do not coincide with any of the coordinate axes, then the resulting equation will have one or more
cross product terms.

(a) Show that the equation
$$\frac{4}{3}x^2 + \frac{4}{3}y^2 + \frac{4}{3}z^2 + \frac{4}{3}xy + \frac{4}{3}xz + \frac{4}{3}yz = 1$$
represents an ellipsoid, and find the lengths of its axes. [Suggestion: Write the equation in the form $\mathbf{x}^TA\mathbf{x} = 1$ and make
an orthogonal change of variable to eliminate the cross product terms.]


(b) What property must a symmetric 3 × 3 matrix have in order for the equation $\mathbf{x}^TA\mathbf{x} = 1$ to represent an ellipsoid?


Figure Ex-32


33. What property must a symmetric 2 × 2 matrix A have for $\mathbf{x}^TA\mathbf{x} = 1$ to represent a circle?
Answer:


A must have a positive eigenvalue of multiplicity 2.

34. Prove: If $b \neq 0$, then the cross product term can be eliminated from the quadratic form $ax^2 + 2bxy + cy^2$ by rotating the
coordinate axes through an angle $\theta$ that satisfies the equation
$$\cot 2\theta = \frac{a - c}{2b}$$

35. Prove that if A is an n × n symmetric matrix all of whose eigenvalues are nonnegative, then $\mathbf{x}^TA\mathbf{x} \geq 0$ for all nonzero $\mathbf{x}$ in
$R^n$.


True-False Exercises 


In parts (a)-(l) determine whether the statement is true or false, and justify your answer. 


(a) A symmetric matrix with positive definite eigenvalues is positive definite. 


Answer: 


True 


(b) x? © xi + x2 + 4хүхэхз is a quadratic form. 


Answer: 


False 


(c) $(x_1 - 3x_2)^2$ is a quadratic form.


Answer: 


True 


(d) A positive definite matrix is invertible. 


Answer: 


True 


(e) A symmetric matrix is either positive definite, negative definite, or indefinite. 
Answer: 


False 


(f) If A is positive definite, then −A is negative definite.
Answer: 


True 


(g) $\mathbf{x} \cdot \mathbf{x}$ is a quadratic form for all $\mathbf{x}$ in $R^n$.
Answer: 


True 


(h) If $\mathbf{x}^TA\mathbf{x}$ is a positive definite quadratic form, then so is $\mathbf{x}^TA^{-1}\mathbf{x}$.
Answer: 


True 


(i) If A is a matrix with only positive eigenvalues, then $\mathbf{x}^TA\mathbf{x}$ is a positive definite quadratic form.
Answer: 


False 


(j) If A is a 2 × 2 symmetric matrix with positive entries and det(A) > 0, then A is positive definite.
Answer: 


True 


(k) If $\mathbf{x}^TA\mathbf{x}$ is a quadratic form with no cross product terms, then A is a diagonal matrix.
Answer: 


False 


(l) If $\mathbf{x}^TA\mathbf{x}$ is a positive definite quadratic form in two variables and $c \neq 0$, then the graph of the equation $\mathbf{x}^TA\mathbf{x} = c$ is an ellipse.
Answer: 


False 
7.4 Optimization Using Quadratic Forms


Quadratic forms arise in various problems in which the maximum or minimum value of some quantity is required. 
In this section we will discuss some problems of this type. 


Constrained Extremum Problems 


Our first goal in this section is to consider the problem of finding the maximum and minimum values of a
quadratic form $\mathbf{x}^TA\mathbf{x}$ subject to the constraint $\|\mathbf{x}\| = 1$. Problems of this type arise in a wide variety of
applications.


To visualize this problem geometrically in the case where $\mathbf{x}^TA\mathbf{x}$ is a quadratic form on $R^2$, view $z = \mathbf{x}^TA\mathbf{x}$ as the
equation of some surface in a rectangular xyz-coordinate system and view $\|\mathbf{x}\| = 1$ as the unit circle centered at
the origin of the xy-plane. Geometrically, the problem of finding the maximum and minimum values of $\mathbf{x}^TA\mathbf{x}$
subject to the requirement $\|\mathbf{x}\| = 1$ amounts to finding the highest and lowest points on the intersection of the
surface with the right circular cylinder determined by the circle (Figure 7.4.1).


Figure 7.4.1 (labels: constrained maximum, constrained minimum, unit circle)


The following theorem, whose proof is deferred to the end of the section, is the key result for solving problems of 
this type. 


THEOREM 7.4.1 Constrained Extremum Theorem 


Let A be a symmetric n × n matrix whose eigenvalues in order of decreasing size are
$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. Then:

(a) the quadratic form $\mathbf{x}^TA\mathbf{x}$ attains a maximum value and a minimum value on the set of vectors for
which $\|\mathbf{x}\| = 1$;

(b) the maximum value attained in part (a) occurs at a unit vector corresponding to the eigenvalue $\lambda_1$;

(c) the minimum value attained in part (a) occurs at a unit vector corresponding to the eigenvalue $\lambda_n$.


Remark The condition $\|\mathbf{x}\| = 1$ in this theorem is called a constraint, and the maximum or minimum value of
$\mathbf{x}^TA\mathbf{x}$ subject to the constraint is called a constrained extremum. This constraint can also be expressed as
$\mathbf{x}^T\mathbf{x} = 1$ or as $x_1^2 + x_2^2 + \cdots + x_n^2 = 1$, when convenient.


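EXAMPLE 1 Finding Constrained Extrema


Find the maximum and minimum values of the quadratic form
$$z = 5x^2 + 5y^2 + 4xy$$
subject to the constraint $x^2 + y^2 = 1$.


Solution The quadratic form can be expressed in matrix notation as
$$z = 5x^2 + 5y^2 + 4xy = \mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 5 & 2 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$


We leave it for you to show that the eigenvalues of A are $\lambda_1 = 7$ and $\lambda_2 = 3$ and that corresponding
eigenvectors are
$$\lambda_1 = 7:\ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \lambda_2 = 3:\ \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$


Normalizing these eigenvectors yields
$$\lambda_1 = 7:\ \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \qquad \lambda_2 = 3:\ \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \qquad (1)$$


Thus, the constrained extrema are

constrained maximum: $z = 7$ at $(x, y) = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$
constrained minimum: $z = 3$ at $(x, y) = \left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$


Remark Since the negatives of the eigenvectors in 1 are also unit eigenvectors, they too produce the maximum
and minimum values of z; that is, the constrained maximum $z = 7$ also occurs at the point
$(x, y) = \left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$ and the constrained minimum $z = 3$ at $(x, y) = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right)$.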
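
The computation in Example 1 can be sketched for any quadratic form $ax^2 + 2bxy + dy^2$ in two variables. This is an illustrative sketch only: the function name `constrained_extrema` is hypothetical, the eigenvalues come from the closed-form formula for symmetric 2 × 2 matrices, and the unit eigenvector returned for each eigenvalue is one of the two possible choices (its negative works equally well).

```python
import math

def constrained_extrema(a, b, d):
    """Extrema of a*x^2 + 2*b*x*y + d*y^2 on x^2 + y^2 = 1.

    Returns [(max_value, unit_vector), (min_value, unit_vector)].
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    results = []
    for lam in (mean + radius, mean - radius):
        if b != 0:
            v = (b, lam - a)   # solves (A - lam*I)v = 0 for A = [[a, b], [b, d]]
        elif abs(lam - a) <= abs(lam - d):
            v = (1.0, 0.0)     # diagonal matrix: eigenvector for the entry a
        else:
            v = (0.0, 1.0)     # diagonal matrix: eigenvector for the entry d
        norm = math.hypot(v[0], v[1])
        results.append((lam, (v[0] / norm, v[1] / norm)))
    return results

# Example 1: z = 5x^2 + 5y^2 + 4xy, so a = 5, 2b = 4, d = 5
(z_max, p_max), (z_min, p_min) = constrained_extrema(5, 2, 5)
print(z_max, p_max)   # 7.0 at a point close to (0.707, 0.707)
print(z_min, p_min)   # 3.0 at a point close to (0.707, -0.707)
```

The minimum is reported at $(1/\sqrt{2}, -1/\sqrt{2})$, which is the negative of the eigenvector in 1 and is the second minimizing point named in the remark.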
2 ү вы 2 ү 


EXAMPLE 2 A Constrained Extremum Problem


A rectangle is to be inscribed in the ellipse $4x^2 + 9y^2 = 36$, as shown in Figure 7.4.2. Use
eigenvalue methods to find nonnegative values of x and y that produce the inscribed rectangle with
maximum area.


Figure 7.4.2 A rectangle inscribed in the ellipse $4x^2 + 9y^2 = 36$.


Solution The area z of the inscribed rectangle is given by $z = 4xy$, so the problem is to maximize
the quadratic form $z = 4xy$ subject to the constraint $4x^2 + 9y^2 = 36$. In this problem, the graph of
the constraint equation is an ellipse rather than the unit circle as required in Theorem 7.4.1, but we
can remedy this problem by rewriting the constraint as
$$\left(\frac{x}{3}\right)^2 + \left(\frac{y}{2}\right)^2 = 1$$
and defining new variables, $x_1$ and $y_1$, by the equations
$$x = 3x_1 \quad\text{and}\quad y = 2y_1$$
This enables us to reformulate the problem as follows:
$$\text{maximize } z = 4xy = 24x_1y_1$$
subject to the constraint
$$x_1^2 + y_1^2 = 1$$


To solve this problem, we will write the quadratic form $z = 24x_1y_1$ as
$$z = \mathbf{x}^TA\mathbf{x} = \begin{bmatrix} x_1 & y_1 \end{bmatrix} \begin{bmatrix} 0 & 12 \\ 12 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}$$


We now leave it for you to show that the largest eigenvalue of A is $\lambda = 12$ and that the only
corresponding unit eigenvector with nonnegative entries is
$$\mathbf{x} = \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$$


Thus, the maximum area is $z = 12$, and this occurs when
$$x = 3x_1 = \frac{3}{\sqrt{2}} \quad\text{and}\quad y = 2y_1 = \frac{2}{\sqrt{2}} = \sqrt{2}$$
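
As an independent sanity check on Example 2 (not the eigenvalue method itself), one can parametrize the first-quadrant arc of the ellipse and search for the largest area directly. The parametrization $x = 3\cos t$, $y = 2\sin t$ and the grid of 1001 sample points are assumptions made for this sketch.

```python
import math

def area(t):
    """Area 4xy of the inscribed rectangle with corner (3cos t, 2sin t) on 4x^2 + 9y^2 = 36."""
    x, y = 3 * math.cos(t), 2 * math.sin(t)
    return 4 * x * y

# Sample t on [0, pi/2], which covers the nonnegative values of x and y
samples = [k * (math.pi / 2) / 1000 for k in range(1001)]
best_t = max(samples, key=area)
best_x, best_y = 3 * math.cos(best_t), 2 * math.sin(best_t)

print(area(best_t))     # close to 12
print(best_x, best_y)   # close to 3/sqrt(2) and sqrt(2)
```

The brute-force search lands on the same corner point and the same maximum area as the eigenvalue computation.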


Constrained Extrema and Level Curves 


A useful way of visualizing the behavior of a function $f(x, y)$ of two variables is to consider the curves in the
xy-plane along which $f(x, y)$ is constant. These curves have equations of the form
$$f(x, y) = k$$
and are called the level curves of f (Figure 7.4.3). In particular, the level curves of a quadratic form $\mathbf{x}^TA\mathbf{x}$ on $R^2$
have equations of the form
$$\mathbf{x}^TA\mathbf{x} = k \qquad (2)$$
so the maximum and minimum values of $\mathbf{x}^TA\mathbf{x}$ subject to the constraint $\|\mathbf{x}\| = 1$ are the largest and smallest
values of k for which the graph of 2 intersects the unit circle. Typically, such values of k produce level curves that
just touch the unit circle (Figure 7.4.4), and the coordinates of the points where the level curves just touch produce
the vectors that maximize or minimize $\mathbf{x}^TA\mathbf{x}$ subject to the constraint $\|\mathbf{x}\| = 1$.


Figure 7.4.3 (level curve $f(x, y) = k$)


Figure 7.4.4 


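EXAMPLE 3 Example 1 Revisited Using Level Curves


In Example 1 (and its following remark) we found the maximum and minimum values of the
quadratic form
$$z = 5x^2 + 5y^2 + 4xy$$
subject to the constraint $x^2 + y^2 = 1$. We showed that the constrained maximum is $z = 7$, and this is
attained at the points
$$(x, y) = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) \quad\text{and}\quad (x, y) = \left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) \qquad (3)$$
and that the constrained minimum is $z = 3$, and this is attained at the points
$$(x, y) = \left(-\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) \quad\text{and}\quad (x, y) = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) \qquad (4)$$


Geometrically, this means that the level curve $5x^2 + 5y^2 + 4xy = 7$ should just touch the unit
circle at the points in 3, and the level curve $5x^2 + 5y^2 + 4xy = 3$ should just touch it at the points
in 4. All of this is consistent with Figure 7.4.5.


Figure 7.4.5 (level curves $5x^2 + 5y^2 + 4xy = 7$ and $5x^2 + 5y^2 + 4xy = 3$ touching the unit circle)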
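
One way to make "just touch" concrete is to check that, at each of the four points, the level curve and the unit circle share a tangent line, which happens exactly when their gradients are parallel. The sketch below assumes this gradient formulation (it uses partial derivatives, so it goes slightly beyond the text), and the helper names are hypothetical.

```python
import math

def gradient_of_form(x, y):
    """Gradient of z = 5x^2 + 5y^2 + 4xy."""
    return (10 * x + 4 * y, 10 * y + 4 * x)

def gradient_of_constraint(x, y):
    """Gradient of x^2 + y^2."""
    return (2 * x, 2 * y)

def are_parallel(u, v, tol=1e-12):
    """Two plane vectors are parallel when their 2 x 2 determinant vanishes."""
    return abs(u[0] * v[1] - u[1] * v[0]) < tol

r = 1 / math.sqrt(2)
points = [(r, r), (-r, -r), (-r, r), (r, -r)]
touching = [are_parallel(gradient_of_form(x, y), gradient_of_constraint(x, y))
            for x, y in points]
print(touching)   # [True, True, True, True]

# At a point of the circle that is not an extremum, the curves cross instead
print(are_parallel(gradient_of_form(1, 0), gradient_of_constraint(1, 0)))   # False
```

Since the gradient of $\mathbf{x}^TA\mathbf{x}$ is $2A\mathbf{x}$ and the gradient of the constraint is $2\mathbf{x}$, the parallel condition says that $\mathbf{x}$ is an eigenvector of A, which matches Theorem 7.4.1.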


CALCULUS REQUIRED 
Relative Extrema of Functions of Two Variables 


We will conclude this section by showing how quadratic forms can be used to study characteristics of real-valued 
functions of two variables. 


Recall that if a function $f(x, y)$ has first-order partial derivatives, then its relative maxima and minima, if any,
occur at points where
$$f_x(x, y) = 0 \quad\text{and}\quad f_y(x, y) = 0$$
These are called critical points of f. The specific behavior of f at a critical point $(x_0, y_0)$ is determined by the sign
of
$$D(x, y) = f(x, y) - f(x_0, y_0) \qquad (5)$$
at points $(x, y)$ that are close to, but different from, $(x_0, y_0)$:


* If $D(x, y) > 0$ at points $(x, y)$ that are sufficiently close to, but different from, $(x_0, y_0)$, then
$f(x_0, y_0) < f(x, y)$ at such points and f is said to have a relative minimum at $(x_0, y_0)$ (Figure 7.4.6a).


* If $D(x, y) < 0$ at points $(x, y)$ that are sufficiently close to, but different from, $(x_0, y_0)$, then
$f(x_0, y_0) > f(x, y)$ at such points and f is said to have a relative maximum at $(x_0, y_0)$ (Figure 7.4.6b).


* If $D(x, y)$ has both positive and negative values inside every circle centered at $(x_0, y_0)$, then there are points
$(x, y)$ that are arbitrarily close to $(x_0, y_0)$ at which $f(x_0, y_0) < f(x, y)$ and points $(x, y)$ that are
arbitrarily close to $(x_0, y_0)$ at which $f(x_0, y_0) > f(x, y)$. In this case we say that f has a saddle point at
$(x_0, y_0)$ (Figure 7.4.6c).


Figure 7.4.6


In general, it can be difficult to determine the sign of D directly. However, the following theorem, which is proved
in calculus, makes it possible to analyze critical points using derivatives.


THEOREM 7.4.2 Second Derivative Test 


Suppose that $(x_0, y_0)$ is a critical point of $f(x, y)$ and that f has continuous second-order partial
derivatives in some circular region centered at $(x_0, y_0)$. Then:


(a) f has a relative minimum at $(x_0, y_0)$ if
$f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0) > 0$ and $f_{xx}(x_0, y_0) > 0$
(b) f has a relative maximum at $(x_0, y_0)$ if
$f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0) > 0$ and $f_{xx}(x_0, y_0) < 0$
(c) f has a saddle point at $(x_0, y_0)$ if
$f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0) < 0$
(d) The test is inconclusive if
$f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0) = 0$


Our interest here is in showing how to reformulate this theorem using properties of symmetric matrices. For this
purpose we consider the symmetric matrix
$$H(x, y) = \begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{xy}(x, y) & f_{yy}(x, y) \end{bmatrix}$$
which is called the Hessian or Hessian matrix of f in honor of the German mathematician and scientist Ludwig
Otto Hesse (1811–1874). The notation $H(x, y)$ emphasizes that the entries in the matrix depend on x and y. The
Hessian is of interest because
$$\det[H(x_0, y_0)] = \begin{vmatrix} f_{xx}(x_0, y_0) & f_{xy}(x_0, y_0) \\ f_{xy}(x_0, y_0) & f_{yy}(x_0, y_0) \end{vmatrix} = f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0)$$
is the expression that appears in Theorem 7.4.2. We can now reformulate the second derivative test as follows.


THEOREM 7.4.3 Hessian Form of the Second Derivative Test 


Suppose that $(x_0, y_0)$ is a critical point of $f(x, y)$ and that f has continuous second-order partial
derivatives in some circular region centered at $(x_0, y_0)$. If $H(x_0, y_0)$ is the Hessian of f at $(x_0, y_0)$, then:


(a) f has a relative minimum at $(x_0, y_0)$ if $H(x_0, y_0)$ is positive definite.
(b) f has a relative maximum at $(x_0, y_0)$ if $H(x_0, y_0)$ is negative definite.
(c) f has a saddle point at $(x_0, y_0)$ if $H(x_0, y_0)$ is indefinite.
(d) The test is inconclusive otherwise.


We will prove part (a). The proofs of the remaining parts will be left as exercises. 


Proof (a) If $H(x_0, y_0)$ is positive definite, then Theorem 7.3.4 implies that the principal submatrices of
$H(x_0, y_0)$ have positive determinants. Thus,
$$\det[H(x_0, y_0)] = \begin{vmatrix} f_{xx}(x_0, y_0) & f_{xy}(x_0, y_0) \\ f_{xy}(x_0, y_0) & f_{yy}(x_0, y_0) \end{vmatrix} = f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) - f_{xy}^2(x_0, y_0) > 0$$
and
$$\det[f_{xx}(x_0, y_0)] = f_{xx}(x_0, y_0) > 0$$
so f has a relative minimum at $(x_0, y_0)$ by part (a) of Theorem 7.4.2.


EXAMPLE 4 Using the Hessian to Classify Relative Extrema


Find the critical points of the function
$$f(x, y) = \tfrac{1}{3}x^3 + xy^2 - 8xy + 5$$
and use the eigenvalues of the Hessian matrix at those points to determine which of them, if any, are
relative maxima, relative minima, or saddle points.


Solution To find both the critical points and the Hessian matrix we will need to calculate the first
and second partial derivatives of f. These derivatives are
$$f_x(x, y) = x^2 + y^2 - 8y, \qquad f_y(x, y) = 2xy - 8x, \qquad f_{xy}(x, y) = 2y - 8$$
$$f_{xx}(x, y) = 2x, \qquad f_{yy}(x, y) = 2x$$
Thus, the Hessian matrix is
$$H(x, y) = \begin{bmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{xy}(x, y) & f_{yy}(x, y) \end{bmatrix} = \begin{bmatrix} 2x & 2y - 8 \\ 2y - 8 & 2x \end{bmatrix}$$
To find the critical points we set $f_x$ and $f_y$ equal to zero. This yields the equations
$$f_x(x, y) = x^2 + y^2 - 8y = 0 \quad\text{and}\quad f_y(x, y) = 2xy - 8x = 2x(y - 4) = 0$$
Solving the second equation yields $x = 0$ or $y = 4$. Substituting $x = 0$ in the first equation and
solving for y yields $y = 0$ or $y = 8$; and substituting $y = 4$ into the first equation and solving for x
yields $x = 4$ or $x = -4$. Thus, we have four critical points:
$$(0, 0), \quad (0, 8), \quad (4, 4), \quad (-4, 4)$$


Evaluating the Hessian matrix at these points yields
$$H(0, 0) = \begin{bmatrix} 0 & -8 \\ -8 & 0 \end{bmatrix}, \qquad H(0, 8) = \begin{bmatrix} 0 & 8 \\ 8 & 0 \end{bmatrix}$$
$$H(4, 4) = \begin{bmatrix} 8 & 0 \\ 0 & 8 \end{bmatrix}, \qquad H(-4, 4) = \begin{bmatrix} -8 & 0 \\ 0 & -8 \end{bmatrix}$$


We leave it for you to find the eigenvalues of these matrices and deduce the following classifications
of the critical points:
Critical Point $(x_0, y_0)$ | $\lambda_1$ | $\lambda_2$ | Classification
(0, 0) | 8 | −8 | Saddle point
(0, 8) | 8 | −8 | Saddle point
(4, 4) | 8 | 8 | Relative minimum
(−4, 4) | −8 | −8 | Relative maximum

OPTIONAL 


We conclude this section with an optional proof of Theorem 7.4.1. 


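Proof of Theorem 7.4.1 The first step in the proof is to show that $\mathbf{x}^TA\mathbf{x}$ has constrained maximum and minimum
values for $\|\mathbf{x}\| = 1$. Since A is symmetric, the principal axes theorem (Theorem 7.3.1) implies that there is an
orthogonal change of variable $\mathbf{x} = P\mathbf{y}$ such that
$$\mathbf{x}^TA\mathbf{x} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \qquad (6)$$
in which $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of A. Let us assume that $\|\mathbf{x}\| = 1$ and that the column vectors of P
(which are unit eigenvectors of A) have been ordered so that
$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \qquad (7)$$
Since the matrix P is orthogonal, multiplication by P is length preserving, so that $\|\mathbf{y}\| = \|\mathbf{x}\| = 1$; that is,
$$y_1^2 + y_2^2 + \cdots + y_n^2 = 1$$
It follows from this equation and 7 that
$$\lambda_n = \lambda_n(y_1^2 + y_2^2 + \cdots + y_n^2) \leq \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \leq \lambda_1(y_1^2 + y_2^2 + \cdots + y_n^2) = \lambda_1$$
and hence from 6 that
$$\lambda_n \leq \mathbf{x}^TA\mathbf{x} \leq \lambda_1$$
This shows that all values of $\mathbf{x}^TA\mathbf{x}$ for which $\|\mathbf{x}\| = 1$ lie between the largest and smallest eigenvalues of A. Now
let $\mathbf{x}$ be a unit eigenvector corresponding to $\lambda_1$. Then
$$\mathbf{x}^TA\mathbf{x} = \mathbf{x}^T(\lambda_1\mathbf{x}) = \lambda_1\mathbf{x}^T\mathbf{x} = \lambda_1\|\mathbf{x}\|^2 = \lambda_1$$
which shows that $\mathbf{x}^TA\mathbf{x}$ has $\lambda_1$ as a constrained maximum and that this maximum occurs if $\mathbf{x}$ is a unit eigenvector
of A corresponding to $\lambda_1$. Similarly, if $\mathbf{x}$ is a unit eigenvector corresponding to $\lambda_n$, then
$$\mathbf{x}^TA\mathbf{x} = \mathbf{x}^T(\lambda_n\mathbf{x}) = \lambda_n\mathbf{x}^T\mathbf{x} = \lambda_n\|\mathbf{x}\|^2 = \lambda_n$$
so $\mathbf{x}^TA\mathbf{x}$ has $\lambda_n$ as a constrained minimum and this minimum occurs if $\mathbf{x}$ is a unit eigenvector of A corresponding
to $\lambda_n$. This completes the proof.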


Concept Review 


Constraint 


Constrained extremum 


Level curve 


Critical point 


Relative minimum 


Relative maximum 


Saddle point 


Second derivative test 


Hessian matrix 


Skills 
* Find the maximum and minimum values of a quadratic form subject to a constraint. 


* Find the critical points of a real-valued function of two variables, and use the eigenvalues of the Hessian 
matrix at the critical points to classify them as relative maxima, relative minima, or saddle points. 


Exercise Set 7.4 


In Exercises 1–4, find the maximum and minimum values of the given quadratic form subject to the constraint
$x^2 + y^2 = 1$, and determine the values of x and y at which the maximum and minimum occur.


1. $5x^2 - y^2$
Answer:


Maximum: 5 at (1, 0) and (−1, 0); minimum: −1 at (0, 1) and (0, −1)


2. xy 
3. $3x^2 + 7y^2$
Answer:


Maximum: 7 at (0, 1) and (0, −1); minimum: 3 at (1, 0) and (−1, 0)
4. $5x^2 + 5xy$
In Exercises 5–6, find the maximum and minimum values of the given quadratic form subject to the constraint
$$x^2 + y^2 + z^2 = 1$$
and determine the values of x, y, and z at which the maximum and minimum occur.


5. $9x^2 + 4y^2 + 3z^2$


Answer: 


Maximum: 9 at (1, 0, 0) and (—1, 0, 0); minimum: 3 at (0, 0, 1) and (0, 0, —1) 
6. $2x^2 + y^2 + z^2 + 2xy + 2xz$
7. Use the method of Example 2 to find the maximum and minimum values of xy subject to the constraint
$4x^2 + 8y^2 = 16$.

Answer: 


Maximum: z — 4ү2 at (х,у) = (202, 2) апа (- 2y2, — 2}; minimum: 2 — -4J2 at 
(x,y) = (- 2ү2, 2) and (202, - 2) 

8. Use the method of Example 2 to find the maximum and minimum values of x? xy + 2у? subject to the 
constraint x^ + зу? = 16. 

In Exercises 9—10, draw the unit circle and the level curves corresponding to the given quadratic form. Show that 


the unit circle intersects each of these curves in exactly two places, label the intersection points, and verify that 
the constrained extrema occur at those points. 


9. 52 у“ 
Answer: 
542 у? = 
10. ху 


11. (a) Show that the function $f(x, y) = 4xy - x^4 - y^4$ has critical points at (0, 0), (1, 1), and (−1, −1).


(b) Use the Hessian form of the second derivative test to show f has relative maxima at (1, 1) and (−1, −1)
and a saddle point at (0, 0).


12. (a) Show that the function $f(x, y) = x^3 - 6xy - y^3$ has critical points at (0, 0) and (−2, 2).


(b) Use the Hessian form of the second derivative test to show f has a relative maximum at (−2, 2) and a
saddle point at (0, 0).


In Exercises 13–16, find the critical points of f, if any, and classify them as relative maxima, relative minima, or
saddle points.


13. $f(x, y) = x^3 - 3xy - y^3$
Answer:


Critical points: (−1, 1), relative maximum; (0, 0), saddle point


14. $f(x, y) = x^3 - 3xy + y^3$

15. $f(x, y) = x^2 + 2y^2 - x^2y$

Answer:

Critical points: (0, 0), relative minimum; (2, 1) and (−2, 1), saddle points

16. $f(x, y) = x^3 + y^3 - 3x - 3y$


17. A rectangle whose center is at the origin and whose sides are parallel to the coordinate axes is to be inscribed
in the ellipse $x^2 + 25y^2 = 25$. Use the method of Example 2 to find nonnegative values of x and y that
produce the inscribed rectangle with maximum area.


Answer:


Corner points: $\left(\pm\frac{5}{\sqrt{2}}, \pm\frac{1}{\sqrt{2}}\right)$


18. Suppose that the temperature at a point (x, y) on a metal plate is $T(x, y) = 4x^2 - 4xy + y^2$. An ant, walking
on the plate, traverses a circle of radius 5 centered at the origin. What are the highest and lowest temperatures
encountered by the ant?


19. (a) Show that the functions
$$f(x, y) = x^4 + y^4 \quad\text{and}\quad g(x, y) = x^4 - y^4$$
have a critical point at (0, 0) but the second derivative test is inconclusive at that point.


(b) Give a reasonable argument to show that f has a relative minimum at (0, 0) and g has a saddle point at (0,
0).


20. Suppose that the Hessian matrix of a certain quadratic form $f(x, y)$ is


H =


What can you say about the location and classification of the critical points of f?


21. Suppose that A is an n × n symmetric matrix and
$$q(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$$
where $\mathbf{x}$ is a vector in $R^n$ that is expressed in column form. What can you say about the value of q if $\mathbf{x}$ is a unit
eigenvector corresponding to an eigenvalue $\lambda$ of A?


Answer:


$q(\mathbf{x}) = \lambda$


22. Prove: If $x^T A x$ is a quadratic form whose minimum and maximum values subject to the constraint ||x|| = 1 are m and M, respectively, then for each number c in the interval $m \le c \le M$, there is a unit vector $x_c$ such that $x_c^T A x_c = c$. [Hint: In the case where m < M, let $u_m$ and $u_M$ be unit eigenvectors of A such that $u_m^T A u_m = m$ and $u_M^T A u_M = M$, and let

$x_c = \sqrt{\dfrac{M - c}{M - m}}\, u_m + \sqrt{\dfrac{c - m}{M - m}}\, u_M$

Show that $x_c^T A x_c = c$.]
True-False Exercises 
In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 
(a) A quadratic form must have either a maximum or minimum value. 

Answer: 


False 


(b) The maximum value of a quadratic form $x^T A x$ subject to the constraint ||x|| = 1 occurs at a unit eigenvector corresponding to the largest eigenvalue of A.


Answer: 


True 


(c) The Hessian matrix of a function f with continuous second-order partial derivatives is a symmetric matrix. 
Answer: 


True 


(d) If $(x_0, y_0)$ is a critical point of a function f and the Hessian of f at $(x_0, y_0)$ is 0, then f has neither a relative maximum nor a relative minimum at $(x_0, y_0)$.


Answer: 


False 


(e) If A is a symmetric matrix and det A < 0, then the minimum of $x^T A x$ subject to the constraint ||x|| = 1 is negative.
Answer: 


True 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


7.5 Hermitian, Unitary, and Normal Matrices 


We know that every real symmetric matrix is orthogonally diagonalizable and that the real symmetric matrices 
are the only orthogonally diagonalizable matrices. In this section we will consider the diagonalization problem 
for complex matrices. 


Hermitian and Unitary Matrices 


The transpose operation is less important for complex matrices than for real matrices. A more useful operation 
for complex matrices is given in the following definition. 


DEFINITION 1 


If A is a complex matrix, then the conjugate transpose of A, denoted by $A^*$, is defined by

$A^* = \bar{A}^T$ (1)

Remark Since part (b) of Theorem 5.3.2 states that $\overline{(A^T)} = (\bar{A})^T$, the order in which the transpose and conjugation operations are performed in computing $A^* = \bar{A}^T$ does not matter. Moreover, in the case where A has real entries we have $A^* = (\bar{A})^T = A^T$, so $A^*$ is the same as $A^T$ for real matrices.


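EXAMPLE 1 Conjugate Transpose

Find the conjugate transpose $A^*$ of the matrix

$A = \begin{bmatrix} 1+i & -i & 0 \\ 2 & 3-2i & i \end{bmatrix}$

Solution We have

$\bar{A} = \begin{bmatrix} 1-i & i & 0 \\ 2 & 3+2i & -i \end{bmatrix}$ and hence $A^* = \bar{A}^T = \begin{bmatrix} 1-i & 2 \\ i & 3+2i \\ 0 & -i \end{bmatrix}$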
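The computation in Example 1 can be sketched in a few lines of Python. This is only an illustration: the helper name `conjugate_transpose` is hypothetical, matrices are stored as lists of rows, and the entries of `A` are assumed to be those of the matrix in Example 1.

```python
def conjugate_transpose(A):
    """Return A* (transpose of the entrywise conjugate) for a list-of-rows matrix."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]


# Matrix assumed to be the one in Example 1 (2 x 3, complex entries).
A = [[1 + 1j, -1j, 0],
     [2, 3 - 2j, 1j]]

A_star = conjugate_transpose(A)            # 3 x 2 matrix
double_star = conjugate_transpose(A_star)  # applying the operation twice returns A
```

Python's built-in numbers (`int`, `float`, `complex`) all provide a `conjugate()` method, so real entries need no special handling.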


The following theorem, parts of which are given as exercises, shows that the basic algebraic 
properties of the conjugate transpose operation are similar to those of the transpose (compare to 
Theorem 1.4.8). 


THEOREM 7.5.1

If k is a complex scalar, and if A, B, and C are complex matrices whose sizes are such that the stated operations can be performed, then:

(a) $(A^*)^* = A$
(b) $(A + B)^* = A^* + B^*$
(c) $(A - B)^* = A^* - B^*$
(d) $(kA)^* = \bar{k} A^*$
(e) $(AB)^* = B^* A^*$


Remark Note that the relationship $u \cdot v = \bar{v}^T u$ in Formula 5 of Section 5.3 can be expressed in terms of the conjugate transpose as

$u \cdot v = v^* u$ (2)

We are now ready to define two new classes of matrices that will be important in our study of diagonalization in $C^n$.


DEFINITION 2 


A square complex matrix A is said to be unitary if
$A^{-1} = A^*$ (3)
and is said to be Hermitian if
$A^* = A$ (4)

Note that a unitary matrix can also be defined as a square complex matrix A for which
$AA^* = A^*A = I$

If A is a real matrix, then $A^* = A^T$, in which case (3) becomes $A^{-1} = A^T$ and (4) becomes $A^T = A$. Thus, the unitary matrices are complex generalizations of the real orthogonal matrices and Hermitian matrices are complex generalizations of the real symmetric matrices.
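As a rough numerical companion to Definition 2, the sketch below tests the conditions $A^* = A$ and $AA^* = A^*A = I$ directly. The function names, the tolerance, and the sample matrices `U` and `Q` are assumptions made for illustration; they are not taken from the text.

```python
def conjugate_transpose(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices of the same size."""
    return (len(A) == len(B) and len(A[0]) == len(B[0]) and
            all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb)))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_hermitian(A):
    return close(conjugate_transpose(A), A)           # Formula (4): A* = A

def is_unitary(A):
    A_star = conjugate_transpose(A)
    I = identity(len(A))
    return close(matmul(A, A_star), I) and close(matmul(A_star, A), I)

H = [[2, 1 + 1j], [1 - 1j, 3]]   # Hermitian, not unitary
U = [[0, 1j], [1j, 0]]           # unitary, not Hermitian (hypothetical example)
Q = [[0, -1], [1, 0]]            # real orthogonal, hence unitary
```

Testing $AA^* = I$ avoids computing an inverse, which is the practical reason the alternative characterization of unitary matrices is convenient.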


EXAMPLE 2 Recognizing Hermitian Matrices

Hermitian matrices are easy to recognize because their diagonal entries are real (why?), and the entries that are symmetrically positioned across the main diagonal are complex conjugates. Thus, for example, we can tell by inspection that

$A = \begin{bmatrix} 1 & i & 1+i \\ -i & -5 & 2-i \\ 1-i & 2+i & 3 \end{bmatrix}$

is Hermitian.


The fact that real symmetric matrices have real eigenvalues is a special case of the following more general 
result about Hermitian matrices, the proof of which is left for the exercises. 


THEOREM 7.5.2 


The eigenvalues of a Hermitian matrix are real numbers. 


The fact that eigenvectors from different eigenspaces of a real symmetric matrix are orthogonal is a special 
case of the following more general result about Hermitian matrices. 


THEOREM 7.5.3 


If A is a Hermitian matrix, then eigenvectors from different eigenspaces are orthogonal. 


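Proof Let $v_1$ and $v_2$ be eigenvectors of A corresponding to distinct eigenvalues $\lambda_1$ and $\lambda_2$. Using Formula 2 and the facts that $\lambda_1 = \bar{\lambda}_1$, $\lambda_2 = \bar{\lambda}_2$, and $A = A^*$, we can write

$\lambda_1 (v_2 \cdot v_1) = (\lambda_1 v_1)^* v_2 = (A v_1)^* v_2 = (v_1^* A^*) v_2 = (v_1^* A) v_2 = v_1^* (A v_2) = v_1^* (\lambda_2 v_2) = \lambda_2 (v_1^* v_2) = \lambda_2 (v_2 \cdot v_1)$

This implies that $(\lambda_1 - \lambda_2)(v_2 \cdot v_1) = 0$ and hence that $v_2 \cdot v_1 = 0$ (since $\lambda_1 \ne \lambda_2$).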


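EXAMPLE 3 Eigenvalues and Eigenvectors of a Hermitian Matrix

Confirm that the Hermitian matrix

$A = \begin{bmatrix} 2 & 1+i \\ 1-i & 3 \end{bmatrix}$

has real eigenvalues and that eigenvectors from different eigenspaces are orthogonal.

Solution The characteristic polynomial of A is

$\det(\lambda I - A) = \begin{vmatrix} \lambda - 2 & -1-i \\ -1+i & \lambda - 3 \end{vmatrix} = (\lambda - 2)(\lambda - 3) - (-1-i)(-1+i) = (\lambda^2 - 5\lambda + 6) - 2 = (\lambda - 1)(\lambda - 4)$

so the eigenvalues of A are $\lambda = 1$ and $\lambda = 4$, which are real. Bases for the eigenspaces of A can be obtained by solving the linear system

$\begin{bmatrix} \lambda - 2 & -1-i \\ -1+i & \lambda - 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

with $\lambda = 1$ and with $\lambda = 4$. We leave it for you to do this and to show that the general solutions of these systems are

$\lambda = 1$: $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = t \begin{bmatrix} -1-i \\ 1 \end{bmatrix}$ and $\lambda = 4$: $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = t \begin{bmatrix} \frac{1}{2}(1+i) \\ 1 \end{bmatrix}$

Thus, bases for these eigenspaces are

$\lambda = 1$: $v_1 = \begin{bmatrix} -1-i \\ 1 \end{bmatrix}$ and $\lambda = 4$: $v_2 = \begin{bmatrix} \frac{1}{2}(1+i) \\ 1 \end{bmatrix}$

The vectors $v_1$ and $v_2$ are orthogonal since

$v_1 \cdot v_2 = (-1-i)\,\overline{\left(\tfrac{1}{2}(1+i)\right)} + (1)(\bar{1}) = \tfrac{1}{2}(-1-i)(1-i) + 1 = 0$

and hence all scalar multiples of them are also orthogonal.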
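The claims in Example 3 can be spot-checked numerically. The following sketch assumes the matrix and eigenvectors of Example 3; the helper names `eigenvalues_2x2`, `inner`, and `apply` are hypothetical. The eigenvalues are obtained from the characteristic polynomial $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$ via the quadratic formula.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A)*lambda + det(A) for a 2 x 2 matrix."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2

def inner(u, v):
    """Complex Euclidean inner product: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def apply(A, v):
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, 1 + 1j], [1 - 1j, 3]]
lam1, lam2 = eigenvalues_2x2(A)

v1 = [-1 - 1j, 1]            # basis vector for the eigenspace of lambda = 1
v2 = [(1 + 1j) / 2, 1]       # basis vector for the eigenspace of lambda = 4
dot = inner(v1, v2)          # should vanish for a Hermitian matrix
```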


Unitary matrices are not usually easy to recognize by inspection. However, the following analog of Theorems 
7.1.1 and 7.1.3, part of which is proved in the exercises, provides a way of ascertaining whether a matrix is 


unitary without computing its inverse. 


THEOREM 7.5.4 


If A is an $n \times n$ matrix with complex entries, then the following are equivalent.

(a) A is unitary.

(b) $\|Ax\| = \|x\|$ for all x in $C^n$.

(c) $Ax \cdot Ay = x \cdot y$ for all x and y in $C^n$.

(d) The column vectors of A form an orthonormal set in $C^n$ with respect to the complex Euclidean inner product.

(e) The row vectors of A form an orthonormal set in $C^n$ with respect to the complex Euclidean inner product.


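EXAMPLE 4 A Unitary Matrix

Use Theorem 7.5.4 to show that

$A = \begin{bmatrix} \frac{1}{2}(1+i) & \frac{1}{2}(1+i) \\ \frac{1}{2}(1-i) & \frac{1}{2}(-1+i) \end{bmatrix}$

is unitary, and then find $A^{-1}$.

Solution We will show that the row vectors

$r_1 = \begin{bmatrix} \frac{1}{2}(1+i) & \frac{1}{2}(1+i) \end{bmatrix}$ and $r_2 = \begin{bmatrix} \frac{1}{2}(1-i) & \frac{1}{2}(-1+i) \end{bmatrix}$

are orthonormal. The relevant computations are

$\|r_1\| = \sqrt{\left|\tfrac{1}{2}(1+i)\right|^2 + \left|\tfrac{1}{2}(1+i)\right|^2} = \sqrt{\tfrac{1}{2} + \tfrac{1}{2}} = 1$

$\|r_2\| = \sqrt{\left|\tfrac{1}{2}(1-i)\right|^2 + \left|\tfrac{1}{2}(-1+i)\right|^2} = \sqrt{\tfrac{1}{2} + \tfrac{1}{2}} = 1$

$r_1 \cdot r_2 = \left(\tfrac{1}{2}(1+i)\right)\overline{\left(\tfrac{1}{2}(1-i)\right)} + \left(\tfrac{1}{2}(1+i)\right)\overline{\left(\tfrac{1}{2}(-1+i)\right)} = \left(\tfrac{1}{2}(1+i)\right)\left(\tfrac{1}{2}(1+i)\right) + \left(\tfrac{1}{2}(1+i)\right)\left(\tfrac{1}{2}(-1-i)\right) = \tfrac{1}{2}i - \tfrac{1}{2}i = 0$

Since we now know that A is unitary, it follows that

$A^{-1} = A^* = \begin{bmatrix} \frac{1}{2}(1-i) & \frac{1}{2}(1+i) \\ \frac{1}{2}(1-i) & \frac{1}{2}(-1-i) \end{bmatrix}$

You can confirm the validity of this result by showing that $AA^* = A^*A = I$.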
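The check suggested at the end of Example 4 can be carried out numerically. The sketch below assumes the matrix of Example 4 as reconstructed from its row vectors; the helper names are hypothetical. It tests part (e) of Theorem 7.5.4 (orthonormal rows) and then uses $A^{-1} = A^*$.

```python
import math

def inner(u, v):
    """Complex Euclidean inner product: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def conjugate_transpose(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[(1 + 1j) / 2, (1 + 1j) / 2],
     [(1 - 1j) / 2, (-1 + 1j) / 2]]
r1, r2 = A

rows_orthonormal = (math.isclose(norm(r1), 1.0)
                    and math.isclose(norm(r2), 1.0)
                    and abs(inner(r1, r2)) < 1e-12)

A_inv = conjugate_transpose(A)   # valid only because A is unitary
product = matmul(A, A_inv)       # should be the 2 x 2 identity
```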


Unitary Diagonalizability 


Since unitary matrices are the complex analogs of the real orthogonal matrices, the following definition is a 
natural generalization of orthogonal diagonalizability for real matrices. 


DEFINITION 3 


A square complex matrix is said to be unitarily diagonalizable if there is a unitary matrix P such that $P^*AP = D$ is a complex diagonal matrix. Any such matrix P is said to unitarily diagonalize A.

Recall that a real symmetric $n \times n$ matrix A has an orthonormal set of n eigenvectors and is orthogonally diagonalized by any $n \times n$ matrix whose column vectors are an orthonormal set of eigenvectors of A. Here is the complex analog of that result.

THEOREM 7.5.5

Every $n \times n$ Hermitian matrix A has an orthonormal set of n eigenvectors and is unitarily diagonalized by any $n \times n$ matrix P whose column vectors form an orthonormal set of eigenvectors of A.


The procedure for unitarily diagonalizing a Hermitian matrix А is exactly the same as that for orthogonally 
diagonalizing a symmetric matrix: 


Unitarily Diagonalizing a Hermitian Matrix 


Step 1. Find a basis for each eigenspace of A. 


Step 2. Apply the Gram-Schmidt process to each of these bases to obtain orthonormal bases for the 
eigenspaces. 


Step 3. Form the matrix P whose column vectors are the basis vectors obtained in Step 2. This will be a unitary matrix (Theorem 7.5.4) and will unitarily diagonalize A.


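EXAMPLE 5 Unitary Diagonalization of a Hermitian Matrix

Find a matrix P that unitarily diagonalizes the Hermitian matrix

$A = \begin{bmatrix} 2 & 1+i \\ 1-i & 3 \end{bmatrix}$

Solution We showed in Example 3 that the eigenvalues of A are $\lambda = 1$ and $\lambda = 4$ and that bases for the corresponding eigenspaces are

$\lambda = 1$: $v_1 = \begin{bmatrix} -1-i \\ 1 \end{bmatrix}$ and $\lambda = 4$: $v_2 = \begin{bmatrix} \frac{1}{2}(1+i) \\ 1 \end{bmatrix}$

Since each eigenspace has only one basis vector, the Gram-Schmidt process is simply a matter of normalizing these basis vectors. We leave it for you to show that

$p_1 = \dfrac{v_1}{\|v_1\|} = \begin{bmatrix} \frac{-1-i}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{bmatrix}$ and $p_2 = \dfrac{v_2}{\|v_2\|} = \begin{bmatrix} \frac{1+i}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \end{bmatrix}$

Thus, A is unitarily diagonalized by the matrix

$P = \begin{bmatrix} p_1 & p_2 \end{bmatrix} = \begin{bmatrix} \frac{-1-i}{\sqrt{3}} & \frac{1+i}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} \end{bmatrix}$

Although it is a little tedious, you may want to check this result by showing that

$P^*AP = \begin{bmatrix} \frac{-1+i}{\sqrt{3}} & \frac{1}{\sqrt{3}} \\ \frac{1-i}{\sqrt{6}} & \frac{2}{\sqrt{6}} \end{bmatrix} \begin{bmatrix} 2 & 1+i \\ 1-i & 3 \end{bmatrix} \begin{bmatrix} \frac{-1-i}{\sqrt{3}} & \frac{1+i}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}$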

Skew-Symmetric and Skew-Hermitian Matrices

In Exercise 37 of Section 1.7 we defined a square matrix with real entries to be skew-symmetric if $A^T = -A$. A skew-symmetric matrix must have zeros on the main diagonal (why?), and each entry off the main diagonal must be the negative of its mirror image about the main diagonal. Here is an example.

$A = \begin{bmatrix} 0 & 1 & -2 \\ -1 & 0 & 4 \\ 2 & -4 & 0 \end{bmatrix}$ [skew-symmetric]

We leave it for you to confirm that $A^T = -A$.

The complex analogs of the skew-symmetric matrices are the matrices for which $A^* = -A$. Such matrices are said to be skew-Hermitian.

Since a skew-Hermitian matrix A has the property

$\bar{A} = -A^T$

it must be that A has zeros or pure imaginary numbers on the main diagonal (why?), and that the complex conjugate of each entry off the main diagonal is the negative of its mirror image about the main diagonal. Here is an example.

$A = \begin{bmatrix} i & 1-i & 5 \\ -1-i & 2i & i \\ -5 & i & 0 \end{bmatrix}$ [skew-Hermitian]


Normal Matrices 


Hermitian matrices enjoy many, but not all, of the properties of real symmetric matrices. For example, we 
know that real symmetric matrices are orthogonally diagonalizable and Hermitian matrices are unitarily 
diagonalizable. However, whereas the real symmetric matrices are the only orthogonally diagonalizable 
matrices, the Hermitian matrices do not constitute the entire class of unitarily diagonalizable complex matrices; 
that is, there exist unitarily diagonalizable matrices that are not Hermitian. Specifically, it can be proved that a square complex matrix A is unitarily diagonalizable if and only if

$AA^* = A^*A$
Matrices with this property are said to be normal. Normal matrices include the Hermitian, skew-Hermitian, 
and unitary matrices in the complex case and the symmetric, skew-symmetric, and orthogonal matrices in the 
real case. The nonzero skew-symmetric matrices are particularly interesting because they are examples of real 
matrices that are not orthogonally diagonalizable but are unitarily diagonalizable. 
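The normality condition is easy to test numerically. In the sketch below the function names and tolerance are assumptions, the first two matrices are taken to be the skew-symmetric and skew-Hermitian examples given earlier in this section, and the third is a hypothetical matrix that fails the condition.

```python
def conjugate_transpose(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def is_normal(A):
    """True when A A* and A* A agree entrywise (square A assumed)."""
    A_star = conjugate_transpose(A)
    return close(matmul(A, A_star), matmul(A_star, A))

skew_symmetric = [[0, 1, -2], [-1, 0, 4], [2, -4, 0]]
skew_hermitian = [[1j, 1 - 1j, 5], [-1 - 1j, 2j, 1j], [-5, 1j, 0]]
not_normal = [[1, 1], [0, 1]]      # hypothetical counterexample
```

A matrix such as `not_normal` is not unitarily diagonalizable, which is consistent with the fact that it is not diagonalizable at all.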


A Comparison of Eigenvalues 


We have seen that Hermitian matrices have real eigenvalues. In the exercises we will ask you to show that the 
eigenvalues of a skew-Hermitian matrix are either zero or purely imaginary (have real part of zero) and that the 
eigenvalues of unitary matrices have modulus 1. These ideas are illustrated schematically in Figure 7.5.1. 


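Figure 7.5.1 [Complex plane: pure imaginary eigenvalues (skew-Hermitian) lie on the imaginary axis, eigenvalues with $|\lambda| = 1$ (unitary) lie on the unit circle, and real eigenvalues (Hermitian) lie on the real axis.]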
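The three eigenvalue locations described above can be observed on small examples. The sketch below is only illustrative: it handles 2 x 2 matrices through the quadratic formula, the Hermitian and unitary matrices are assumed to be those of Examples 3 and 4, and the skew-Hermitian matrix is a hypothetical example.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of lambda^2 - tr(A)*lambda + det(A) for a 2 x 2 matrix."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

hermitian = [[2, 1 + 1j], [1 - 1j, 3]]
skew_hermitian = [[1j, 1 + 1j], [-1 + 1j, 0]]     # hypothetical example
unitary = [[(1 + 1j) / 2, (1 + 1j) / 2],
           [(1 - 1j) / 2, (-1 + 1j) / 2]]

real_ok = all(abs(lam.imag) < 1e-12 for lam in eigenvalues_2x2(hermitian))
imag_ok = all(abs(lam.real) < 1e-12 for lam in eigenvalues_2x2(skew_hermitian))
unit_ok = all(abs(abs(lam) - 1) < 1e-12 for lam in eigenvalues_2x2(unitary))
```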


Concept Review 

* Conjugate transpose 

* Unitary matrix 

* Hermitian matrix 

* Unitarily diagonalizable matrix 
* Skew-symmetric matrix 

* Skew-Hermitian matrix 


* Normal matrix 


Skills 

* Find the conjugate transpose of a matrix. 
* Be able to identify Hermitian matrices.

* Find the inverse of a unitary matrix. 


* Find a unitary matrix that diagonalizes a Hermitian matrix. 


Exercise Set 7.5 


In Exercises 1-2, find $A^*$.

1. $A = \begin{bmatrix} 2i & 1-i \\ 4 & 3+i \\ 5+i & 0 \end{bmatrix}$

Answer:

$A^* = \begin{bmatrix} -2i & 4 & 5-i \\ 1+i & 3-i & 0 \end{bmatrix}$

24 [2 1-i -143 
45-% <i 


In Exercises 3-4, substitute numbers for the x's so that A is Hermitian. 


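3. $A = \begin{bmatrix} 1 & i & 2-3i \\ x & -3 & 1 \\ x & x & 2 \end{bmatrix}$

Answer:

$A = \begin{bmatrix} 1 & i & 2-3i \\ -i & -3 & 1 \\ 2+3i & 1 & 2 \end{bmatrix}$

4. $A = \begin{bmatrix} 2 & 0 & 3+5i \\ x & -4 & -i \\ x & x & 6 \end{bmatrix}$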


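In Exercises 5-6, show that A is not Hermitian for any choice of the x's.

5. (a) $A = \begin{bmatrix} 1 & i & 2-3i \\ -i & -3 & x \\ 2-3i & x & x \end{bmatrix}$
(b) $A = \begin{bmatrix} x & x & 3+5i \\ 0 & i & -i \\ 3-5i & i & x \end{bmatrix}$

Answer:

(a) $a_{13} \ne \bar{a}_{31}$
(b) $a_{22} \ne \bar{a}_{22}$

6. (a) $A = \begin{bmatrix} 1 & 1+i & x \\ 1+i & 7 & x \\ 6-2i & x & 0 \end{bmatrix}$
(b) $A = \begin{bmatrix} 1 & x & 3+5i \\ x & 3 & 1-i \\ 3-5i & x & 2+i \end{bmatrix}$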

In Exercises 7-8, verify that the eigenvalues of the Hermitian matrix А are real and that eigenvectors from 
different eigenspaces are orthogonal (see Theorem 7.5.3). 


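In Exercises 9-12, show that A is unitary, and find $A^{-1}$.

9. $A = \begin{bmatrix} \frac{3}{5} & \frac{4}{5}i \\ -\frac{4}{5} & \frac{3}{5}i \end{bmatrix}$

Answer:

$A^{-1} = A^* = \begin{bmatrix} \frac{3}{5} & -\frac{4}{5} \\ -\frac{4}{5}i & -\frac{3}{5}i \end{bmatrix}$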
10. zT. 


ы 

| 

TS] 
s 


| 
hol] 
——. 
— 
+ 
ы. 
LA 
holo 
OO 
— 
4 
Кеп 
S 


Answer 
-i+ үз 1-iy3 
дад 202 2y 2 
О АЙ ауз 
2y 2 2y 2 
12 1 : 1 
: —=(—1+) —=(1—) 
ГО 


d. 2 


үз y6 


In Exercises 13-18, find a unitary matrix P that diagonalizes the Hermitian matrix A, and determine $P^{-1}AP$.

13. $A = \begin{bmatrix} 4 & 1-i \\ 1+i & 5 \end{bmatrix}$


Answer: 


А= 0 1 
—2—33 —1 4 
20. $A = \begin{bmatrix} 0 & 0 & 3-5i \\ x & 0 & -i \\ x & x & 0 \end{bmatrix}$


In Exercises 21-22, show that A is not skew-Hermitian for any choice of the x's.


21. (а) 0 i 2-3 
= —; 0 х 
2+3 x x 
(b) 1 x 3—5 
A= x 2i -i 
=—3453 i 3i 
Answer: 


(a) $a_{13} \ne -\bar{a}_{31}$
(b) $a_{11} \ne -\bar{a}_{11}$


22. (а) i x 2-31 
A=| x 0 14i 
2-31 —1— x 
(b) 0 =i 4+1 
А= х 0 x 
—4—7 x 1 


In Exercises 23—24, verify that the eigenvalues of the skew-Hermitian matrix А are pure imaginary numbers. 


| 0 p 


l4i i 
24 0 3i 
“A= 
МЧ 


In Exercises 25-26, show that А is normal. 


25. $A = \begin{bmatrix} 1+2i & 2+i & -2-i \\ 2+i & 1+i & -i \\ -2-i & -i & 1+i \end{bmatrix}$

26. $A = \begin{bmatrix} 2+2i & i & 1-i \\ i & -2i & 1-3i \\ 1-i & 1-3i & -3+8i \end{bmatrix}$


27. Show that the matrix

$A = \dfrac{1}{\sqrt{2}} \begin{bmatrix} e^{i\theta} & e^{-i\theta} \\ ie^{i\theta} & -ie^{-i\theta} \end{bmatrix}$

is unitary for all real values of $\theta$. [Note: See Formula 17 in Appendix B for the definition of $e^{i\theta}$.]

28. Prove that each entry on the main diagonal of a skew-Hermitian matrix is either zero or a pure imaginary number.

29. Let A be any $n \times n$ matrix with complex entries, and define the matrices B and C to be

$B = \frac{1}{2}(A + A^*)$ and $C = \frac{1}{2i}(A - A^*)$

(a) Show that B and C are Hermitian.
(b) Show that $A = B + iC$ and $A^* = B - iC$.
(c) What condition must B and C satisfy for A to be normal?

Answer:

(c) B and C must commute.

30. Show that if A is an $n \times n$ matrix with complex entries, and if u and v are vectors in $C^n$ that are expressed in column form, then

$Au \cdot v = u \cdot A^*v$ and $u \cdot Av = A^*u \cdot v$

31. Show that if A is a unitary matrix, then so is $A^*$.

32. Show that the eigenvalues of a skew-Hermitian matrix are either zero or purely imaginary.

33. Show that the eigenvalues of a unitary matrix have modulus 1.

34. Show that if u is a nonzero vector in $C^n$ that is expressed in column form, then $P = uu^*$ is Hermitian.

35. Show that if u is a unit vector in $C^n$ that is expressed in column form, then $H = I - 2uu^*$ is Hermitian and unitary.

36. What can you say about the inverse of a matrix A that is both Hermitian and unitary?

37. Find a 2 × 2 matrix that is both Hermitian and unitary and whose entries are not all real numbers.
Answer: 
E EE I 
(2. ү 
i 1 


39. What geometric interpretations might you reasonably give to multiplication by the matrices $P = uu^*$ and $H = I - 2uu^*$ in Exercises 34 and 35?

Answer:

Multiplication of x by P corresponds to $\|u\|^2$ times the orthogonal projection of x onto W = span{u}. If $\|u\| = 1$, then multiplication of x by $H = I - 2uu^*$ corresponds to reflection of x about the hyperplane $u^\perp$.

40. Prove that if A is an invertible matrix, then $A^*$ is invertible, and $(A^*)^{-1} = (A^{-1})^*$.

41. (a) Prove that $\det(\bar{A}) = \overline{\det(A)}$.
(b) Use the result in part (a) and the fact that a square matrix and its transpose have the same determinant to prove that $\det(A^*) = \overline{\det(A)}$.


42. Use part (b) of Exercise 41 to prove: 

(a) If A is Hermitian, then det(A) is real. 

(b) If A is unitary, then $|\det(A)| = 1$.
43. Use properties of the transpose and complex conjugate to prove parts (a) and (e) of Theorem 7.5.1. 
44. Use properties of the transpose and complex conjugate to prove parts (b) and (d) of Theorem 7.5.1. 


45. Prove that an $n \times n$ matrix A with complex entries is unitary if and only if the columns of A form an orthonormal set in $C^n$.


46. Prove that the eigenvalues of a Hermitian matrix are real. 


True-False Exercises 


In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 


(a) The matrix [К | 1s Hermitian. 
i 


Answer: 
False 
(b) Mm NES C аг 

y2 ye уз 

The matrix 0 ET T is unitary. 
y2 ye уз 

Answer: 

False 


(c) The conjugate transpose of a unitary matrix is unitary. 


Answer: 


True 


(d) Every unitarily diagonalizable matrix is Hermitian. 
Answer: 


False 


(e) A positive integer power of a skew-Hermitian matrix is skew-Hermitian. 
Answer: 


False 
Chapter 7 Supplementary Exercises


1. Verify that each matrix is orthogonal, and find its inverse. 


(a) |3 .4 
5 —5 
4 3 
5 5 
[а 3 
| $9 -$ 
-3 4 12 
25 5 725 
12 3 16 
25 5 25 
Answer 
(а) [з 4] 3 4 
5-51 | $$ 
4 3| |_4 3 
5 5 5 5 
Or 4, 31 [а 9р 
5 5 5 —25 25 
оа 2| _| 9 4 3 
25 5 —25 5 5 
123 16] |_3 _12 16 
25 5 25 5 —25 25 


2. Prove: If Q is an orthogonal matrix, then each entry of Q is the same as its cofactor if det(Q) = 1 and is the negative of its cofactor if det(Q) = -1.

3. Prove that if A is a positive definite symmetric matrix, and if u and v are vectors in $R^n$ in column form, then
$\langle u, v \rangle = u^T A v$
is an inner product on $R^n$.


4. Find the characteristic polynomial and the dimensions of the eigenspaces of the symmetric matrix

$\begin{bmatrix} 3 & 2 & 2 \\ 2 & 3 & 2 \\ 2 & 2 & 3 \end{bmatrix}$

5. Find a matrix P that orthogonally diagonalizes

$A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$

and determine the diagonal matrix $D = P^T A P$.


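Answer:

$P = \begin{bmatrix} -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \\ 0 & 0 & 1 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{bmatrix}$; $P^T A P = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$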


6. Express each quadratic form in the matrix notation $x^T A x$.
(a) $-4x_1^2 + 16x_2^2 - 15x_1x_2$
(b) $9x_1^2 - x_2^2 + 4x_3^2 + 6x_1x_2 - 8x_1x_3 + x_2x_3$
7. Classify the quadratic form
$x_1^2 - 3x_1x_2 + 4x_2^2$
as positive definite, negative definite, indefinite, positive semidefinite, or negative semidefinite.


Answer: 


positive definite 


8. Find an orthogonal change of variable that eliminates the cross product terms in each quadratic form, and 
express the quadratic form in terms of the new variables. 


(a) $-3x_1^2 + 5x_2^2 + 2x_1x_2$
(b) $-5x_1^2 + x_2^2 - x_3^2 + 6x_1x_3 + 4x_1x_2$
9. Identify the type of conic section represented by each equation.
(a) $y - x^2 = 0$
(b) $3x - 11y^2 = 0$

Answer: 


(a) parabola 
(b) parabola 


10. Find a unitary matrix U that diagonalizes 


© = m 
сч — C 


and determine the diagonal matrix $D = U^{-1}AU$.


11. Show that if U is an $n \times n$ unitary matrix and


== = = 1 


then the product 


is also unitary. 
12. Suppose that $A^* = -A$.
(a) Show that iA is Hermitian.

(b) Show that A is unitarily diagonalizable and has pure imaginary eigenvalues.
CHAPTER 8


Linear Transformations 


CHAPTER CONTENTS 


8.1. General Linear Transformations 

8.2. Isomorphism 

8.3. Compositions and Inverse Transformations 
8.4. Matrices for General Linear Transformations 


8.5. Similarity 


INTRODUCTION 


In Section 4.9 and Section 4.10 we studied linear transformations from $R^n$ to $R^m$. In this
chapter we will define and study linear transformations from a general vector space V to a
general vector space W. The results we obtain here have important applications in physics, 
engineering, and various branches of mathematics. 
8.1 General Linear Transformations


Up to now our study of linear transformations has focused on transformations from $R^n$ to $R^m$. In this section we will turn our attention to linear transformations involving general vector spaces. We will illustrate ways in which such transformations arise, and we will establish a fundamental relationship between general n-dimensional vector spaces and $R^n$.


Definitions and Terminology 


In Section 4.9 we defined a matrix transformation $T_A: R^n \to R^m$ to be a mapping of the form
$T_A(x) = Ax$
in which A is an $m \times n$ matrix. We subsequently established in Theorem 4.10.2 and Theorem 4.10.3 that the matrix transformations are precisely the linear transformations from $R^n$ to $R^m$, that is, the transformations with the linearity properties
$T(u + v) = T(u) + T(v)$ and $T(ku) = kT(u)$


We will use these two properties as the starting point for defining more general linear transformations. 


DEFINITION 1 


If $T: V \to W$ is a function from a vector space V to a vector space W, then T is called a linear transformation from V to W if the following two properties hold for all vectors u and v in V and for all scalars k:

(i) $T(ku) = kT(u)$ [Homogeneity property]

(ii) $T(u + v) = T(u) + T(v)$ [Additivity property]

In the special case where V = W, the linear transformation T is called a linear operator on the vector space V.

The homogeneity and additivity properties of a linear transformation $T: V \to W$ can be used in combination to show that if $v_1$ and $v_2$ are vectors in V and $k_1$ and $k_2$ are any scalars, then

$T(k_1 v_1 + k_2 v_2) = k_1 T(v_1) + k_2 T(v_2)$

More generally, if $v_1, v_2, \ldots, v_r$ are vectors in V and $k_1, k_2, \ldots, k_r$ are any scalars, then

$T(k_1 v_1 + k_2 v_2 + \cdots + k_r v_r) = k_1 T(v_1) + k_2 T(v_2) + \cdots + k_r T(v_r)$ (1)
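The two defining properties lend themselves to a quick numerical spot check for maps between coordinate spaces. The sketch below is a hedged illustration, not a proof: `looks_linear` is a hypothetical helper that samples random vectors and scalars, so a `True` result is only evidence of linearity, while a `False` result is a genuine counterexample. The two sample maps are assumptions chosen for illustration.

```python
import random

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    return tuple(k * a for a in u)

def close(u, v, tol=1e-9):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

def looks_linear(T, dim, trials=100):
    """Randomly test additivity and homogeneity of T on R^dim."""
    rng = random.Random(0)                      # fixed seed for repeatability
    for _ in range(trials):
        u = tuple(rng.uniform(-5, 5) for _ in range(dim))
        v = tuple(rng.uniform(-5, 5) for _ in range(dim))
        k = rng.uniform(-5, 5)
        if not close(T(add(u, v)), add(T(u), T(v))):
            return False                        # additivity fails
        if not close(T(scale(k, u)), scale(k, T(u))):
            return False                        # homogeneity fails
    return True

def matrix_map(x):                              # hypothetical map R^2 -> R^3
    x1, x2 = x
    return (2 * x1 - x2, x1 + 3 * x2, 4 * x2)

def shifted_map(x):                             # adds a fixed vector; not linear
    return (x[0] + 1, x[1] + 2)
```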


The following theorem is an analog of parts (a) and (d) of Theorem 4.9.1. 


THEOREM 8.1.1 


If $T: V \to W$ is a linear transformation, then:

(a) $T(0) = 0$.
(b) $T(u - v) = T(u) - T(v)$ for all u and v in V.

Proof Let u be any vector in V. Since 0u = 0, it follows from the homogeneity property in Definition 1 that

$T(0) = T(0u) = 0T(u) = 0$

which proves (a).

We can prove part (b) by rewriting $T(u - v)$ as

$T(u - v) = T(u + (-1)v) = T(u) + (-1)T(v) = T(u) - T(v)$

We leave it for you to justify each step.

Use the two parts of Theorem 8.1.1 to prove that
$T(-v) = -T(v)$
for all v in V.
EXAMPLE 1 Matrix Transformations

Because we have based the definition of a general linear transformation on the homogeneity and additivity properties of matrix transformations, it follows that a matrix transformation $T_A: R^n \to R^m$ is also a linear transformation in this more general sense with $V = R^n$ and $W = R^m$.

EXAMPLE 2 The Zero Transformation

Let V and W be any two vector spaces. The mapping $T: V \to W$ such that T(v) = 0 for every v in V is a linear transformation called the zero transformation. To see that T is linear, observe that

$T(u + v) = 0$, $T(u) = 0$, $T(v) = 0$, and $T(ku) = 0$

Therefore,

$T(u + v) = T(u) + T(v)$ and $T(ku) = kT(u)$

EXAMPLE 3 The Identity Operator

Let V be any vector space. The mapping $I: V \to V$ defined by I(v) = v is called the identity operator on V. We will leave it for you to verify that I is linear.

EXAMPLE 4 Dilation and Contraction Operators

If V is a vector space and k is any scalar, then the mapping $T: V \to V$ given by T(x) = kx is a linear operator on V, for if c is any scalar and if u and v are any vectors in V, then

$T(cu) = k(cu) = c(ku) = cT(u)$

$T(u + v) = k(u + v) = ku + kv = T(u) + T(v)$

If 0 < k < 1, then T is called the contraction of V with factor k, and if k > 1, it is called the dilation of V with factor k (Figure 8.1.1).

Figure 8.1.1 [Two panels, each showing a vector x and its image kx: dilation of V; contraction of V.]


EXAMPLE 5 A Linear Transformation from $P_n$ to $P_{n+1}$

Let $p = p(x) = c_0 + c_1 x + \cdots + c_n x^n$ be a polynomial in $P_n$, and define the transformation $T: P_n \to P_{n+1}$ by

$T(p) = T(p(x)) = xp(x) = c_0 x + c_1 x^2 + \cdots + c_n x^{n+1}$

This transformation is linear because for any scalar k and any polynomials $p_1$ and $p_2$ in $P_n$ we have

$T(kp) = T(kp(x)) = x(kp(x)) = k(xp(x)) = kT(p)$

and

$T(p_1 + p_2) = T(p_1(x) + p_2(x)) = x(p_1(x) + p_2(x)) = xp_1(x) + xp_2(x) = T(p_1) + T(p_2)$
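In coordinates, the transformation of Example 5 simply shifts the coefficient list one place. The sketch below assumes a polynomial $c_0 + c_1 x + \cdots + c_n x^n$ is stored as the list `[c0, c1, ..., cn]`; the function names and the two sample polynomials are hypothetical.

```python
def T(p):
    """Multiply by x: [c0, c1, ..., cn] -> [0, c0, c1, ..., cn]."""
    return [0] + list(p)

def add_poly(p, q):
    """Add two polynomials of the same degree bound."""
    return [a + b for a, b in zip(p, q)]

def scale_poly(k, p):
    return [k * a for a in p]

p1 = [1, 2, 3]     # 1 + 2x + 3x^2
p2 = [4, 0, -1]    # 4 - x^2
k = 5

additive = T(add_poly(p1, p2)) == add_poly(T(p1), T(p2))
homogeneous = T(scale_poly(k, p1)) == scale_poly(k, T(p1))
```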


EXAMPLE 6 A Linear Transformation Using an Inner Product

Let V be an inner product space, let $v_0$ be any fixed vector in V, and let $T: V \to R$ be the transformation

$T(x) = \langle x, v_0 \rangle$

that maps a vector x into its inner product with $v_0$. This transformation is linear, for if k is any scalar, and if u and v are any vectors in V, then it follows from properties of inner products that

$T(ku) = \langle ku, v_0 \rangle = k\langle u, v_0 \rangle = kT(u)$
$T(u + v) = \langle u + v, v_0 \rangle = \langle u, v_0 \rangle + \langle v, v_0 \rangle = T(u) + T(v)$

EXAMPLE 7 Transformations on Matrix Spaces

Let $M_{nn}$ be the vector space of $n \times n$ matrices. In each part determine whether the transformation is linear.

(a) $T_1(A) = A^T$
(b) $T_2(A) = \det(A)$

Solution
(a) It follows from parts (b) and (d) of Theorem 1.4.8 that

$T_1(kA) = (kA)^T = kA^T = kT_1(A)$
$T_1(A + B) = (A + B)^T = A^T + B^T = T_1(A) + T_1(B)$

so $T_1$ is linear.
(b) It follows from Formula 1 of Section 2.3 that
$T_2(kA) = \det(kA) = k^n \det(A) = k^n T_2(A)$
Thus, $T_2$ is not homogeneous and hence not linear if n > 1. Note that additivity also fails because we showed in Example 1 of Section 2.3 that det(A + B) and det(A) + det(B) are not generally equal.
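A small numerical instance makes the failure in part (b) concrete for n = 2. The matrices `A` and `B` and the scalar `k` below are hypothetical choices, and `det2` is an illustrative helper for 2 x 2 determinants only.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def add(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

def scale(k, M):
    return [[k * a for a in row] for row in M]

A = [[1, 2], [3, 4]]       # det(A) = -2
B = [[0, 1], [1, 0]]       # det(B) = -1
k = 3

det_kA = det2(scale(k, A))         # equals k**2 * det(A), not k * det(A)
det_sum = det2(add(A, B))          # differs from det(A) + det(B)
```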


EXAMPLE 8 Translation Is Not Linear

Part (a) of Theorem 8.1.1 states that a linear transformation maps 0 to 0. This property is useful for identifying transformations that are not linear. For example, if $x_0$ is a fixed nonzero vector in $R^2$, then the transformation

$T(x) = x + x_0$

has the geometric effect of translating each point x in a direction parallel to $x_0$ through a distance of $\|x_0\|$ (Figure 8.1.2). This cannot be a linear transformation since $T(0) = x_0$, so T does not map 0 to 0.

Figure 8.1.2 $T(x) = x + x_0$ translates each point x along a line parallel to $x_0$ through a distance $\|x_0\|$.


EXAMPLE 9 The Evaluation Transformation

Let V be a subspace of $F(-\infty, \infty)$, let

$x_1, x_2, \ldots, x_n$

be distinct real numbers, and let $T: V \to R^n$ be the transformation

$T(f) = (f(x_1), f(x_2), \ldots, f(x_n))$ (2)

that associates with f the n-tuple of function values at $x_1, x_2, \ldots, x_n$. We call this the evaluation transformation on V at $x_1, x_2, \ldots, x_n$. Thus, for example, if

$x_1 = -1$, $x_2 = 2$, $x_3 = 4$

and if $f(x) = x^2 - 1$, then

$T(f) = (f(x_1), f(x_2), f(x_3)) = (0, 3, 15)$

The evaluation transformation in (2) is linear, for if k is any scalar, and if f and g are any functions in V, then

$T(kf) = ((kf)(x_1), (kf)(x_2), \ldots, (kf)(x_n)) = (kf(x_1), kf(x_2), \ldots, kf(x_n)) = k(f(x_1), f(x_2), \ldots, f(x_n)) = kT(f)$

and

$T(f + g) = ((f + g)(x_1), (f + g)(x_2), \ldots, (f + g)(x_n)) = (f(x_1) + g(x_1), f(x_2) + g(x_2), \ldots, f(x_n) + g(x_n)) = (f(x_1), f(x_2), \ldots, f(x_n)) + (g(x_1), g(x_2), \ldots, g(x_n)) = T(f) + T(g)$
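The evaluation transformation of Example 9 is straightforward to model with functions as inputs. In the sketch below, `evaluation` is a hypothetical factory that fixes the sample points, `f` is the function from the example, and `g` and `k` are assumed choices used only to illustrate additivity and homogeneity.

```python
def evaluation(points):
    """Return the map T(f) = (f(x1), ..., f(xn)) for the given points."""
    def T(f):
        return tuple(f(x) for x in points)
    return T

T = evaluation((-1, 2, 4))

def f(x):
    return x ** 2 - 1

def g(x):                  # hypothetical second function
    return 3 * x + 2

k = 5
Tf = T(f)
T_sum = T(lambda x: f(x) + g(x))       # T(f + g)
T_scaled = T(lambda x: k * f(x))       # T(kf)
```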


Finding Linear Transformations from Images of Basis Vectors 


We saw in Formula (12) of Section 4.9 that if $T: R^n \to R^m$ is a matrix transformation, say multiplication by A, and if $e_1, e_2, \ldots, e_n$ are the standard basis vectors for $R^n$, then A can be expressed as

$A = [T(e_1) \mid T(e_2) \mid \cdots \mid T(e_n)]$

It follows from this that the image of any vector $v = (c_1, c_2, \ldots, c_n)$ in $R^n$ under multiplication by A can be expressed as

$T(v) = c_1 T(e_1) + c_2 T(e_2) + \cdots + c_n T(e_n)$

This formula tells us that for a matrix transformation the image of any vector is expressible as a linear combination of the images of the standard basis vectors. This is a special case of the following more general result.

THEOREM 8.1.2

Let $T: V \to W$ be a linear transformation, where V is finite-dimensional. If $S = \{v_1, v_2, \ldots, v_n\}$ is a basis for V, then the image of any vector v in V can be expressed as

$T(v) = c_1 T(v_1) + c_2 T(v_2) + \cdots + c_n T(v_n)$ (3)

where $c_1, c_2, \ldots, c_n$ are the coefficients required to express v as a linear combination of the vectors in S.

Proof Express v as $v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$ and use the linearity of T.


EXAMPLE 10 Computing with Images of Basis Vectors

Consider the basis $S = \{v_1, v_2, v_3\}$ for $R^3$, where

$v_1 = (1, 1, 1)$, $v_2 = (1, 1, 0)$, $v_3 = (1, 0, 0)$

Let $T: R^3 \to R^2$ be the linear transformation for which

$T(v_1) = (1, 0)$, $T(v_2) = (2, -1)$, $T(v_3) = (4, 3)$

Find a formula for $T(x_1, x_2, x_3)$, and then use that formula to compute T(2, -3, 5).

Solution We first need to express $x = (x_1, x_2, x_3)$ as a linear combination of $v_1$, $v_2$, and $v_3$. If we write

$(x_1, x_2, x_3) = c_1(1, 1, 1) + c_2(1, 1, 0) + c_3(1, 0, 0)$

then on equating corresponding components, we obtain

$c_1 + c_2 + c_3 = x_1$
$c_1 + c_2 = x_2$
$c_1 = x_3$

which yields $c_1 = x_3$, $c_2 = x_2 - x_3$, $c_3 = x_1 - x_2$, so

$(x_1, x_2, x_3) = x_3(1, 1, 1) + (x_2 - x_3)(1, 1, 0) + (x_1 - x_2)(1, 0, 0) = x_3 v_1 + (x_2 - x_3)v_2 + (x_1 - x_2)v_3$

Thus

$T(x_1, x_2, x_3) = x_3 T(v_1) + (x_2 - x_3)T(v_2) + (x_1 - x_2)T(v_3) = x_3(1, 0) + (x_2 - x_3)(2, -1) + (x_1 - x_2)(4, 3) = (4x_1 - 2x_2 - x_3,\ 3x_1 - 4x_2 + x_3)$

From this formula, we obtain

$T(2, -3, 5) = (9, 23)$
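The computation in Example 10 follows Formula (3) directly, and can be sketched as below. The function names are hypothetical; the coordinates $c_1, c_2, c_3$ are the ones solved for in the example, and `T_formula` is the closed form derived there, included only as a cross-check.

```python
IMAGES = [(1, 0), (2, -1), (4, 3)]      # T(v1), T(v2), T(v3) from Example 10

def T(x1, x2, x3):
    """Apply Formula (3): T(x) = c1*T(v1) + c2*T(v2) + c3*T(v3)."""
    c = (x3, x2 - x3, x1 - x2)          # coordinates of x relative to S
    return tuple(sum(ci * img[j] for ci, img in zip(c, IMAGES)) for j in range(2))

def T_formula(x1, x2, x3):
    """Closed form obtained in the example."""
    return (4 * x1 - 2 * x2 - x3, 3 * x1 - 4 * x2 + x3)

result = T(2, -3, 5)
```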
CALCULUS REQUIRED 


EXAMPLE 11 A Linear Transformation from $C^1(-\infty, \infty)$ to $F(-\infty, \infty)$

Let $V = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$, and let $W = F(-\infty, \infty)$ be the vector space of all real-valued functions defined on $(-\infty, \infty)$. Let $D: V \to W$ be the transformation that maps a function f = f(x) into its derivative; that is,

$D(f) = f'(x)$

From the properties of differentiation, we have

$D(f + g) = D(f) + D(g)$ and $D(kf) = kD(f)$

Thus, D is a linear transformation.


CALCULUS REQUIRED 


EXAMPLE 12 An Integral Transformation

Let $V = C(-\infty, \infty)$ be the vector space of continuous functions on the interval $(-\infty, \infty)$, let $W = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$, and let $J: V \to W$ be the transformation that maps a function f in V into

$J(f) = \int_0^x f(t)\,dt$

For example, if $f(x) = x^2$, then

$J(f) = \int_0^x t^2\,dt = \left.\dfrac{t^3}{3}\right|_0^x = \dfrac{x^3}{3}$

The transformation $J: V \to W$ is linear, for if k is any constant, and if f and g are any functions in V, then properties of the integral imply that

$J(kf) = \int_0^x kf(t)\,dt = k\int_0^x f(t)\,dt = kJ(f)$

$J(f + g) = \int_0^x (f(t) + g(t))\,dt = \int_0^x f(t)\,dt + \int_0^x g(t)\,dt = J(f) + J(g)$
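Restricted to polynomials, the integral transformation of Example 12 acts on coefficient lists in a simple way, which gives a way to check the example exactly. The sketch below is an illustration under assumptions: a polynomial $c_0 + c_1 t + \cdots + c_n t^n$ is stored as `[c0, ..., cn]`, exact rational arithmetic is used via `fractions.Fraction`, and the second polynomial `g` is hypothetical.

```python
from fractions import Fraction

def J(coeffs):
    """Integrate c0 + c1*t + ... + cn*t^n from 0 to x; return coefficients in x."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def add_poly(p, q):
    return [a + b for a, b in zip(p, q)]

def scale_poly(k, p):
    return [k * a for a in p]

f = [0, 0, 1]          # f(x) = x^2
g = [1, -4, 6]         # hypothetical: g(x) = 1 - 4x + 6x^2
k = 7

Jf = J(f)              # coefficients of x^3 / 3
additive = J(add_poly(f, g)) == add_poly(J(f), J(g))
homogeneous = J(scale_poly(k, f)) == scale_poly(k, J(f))
```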


Kernel and Range

Recall that if A is an $m \times n$ matrix, then the null space of A consists of all vectors x in $R^n$ such that Ax = 0, and by Theorem 4.7.1 the column space of A consists of all vectors b in $R^m$ for which there is at least one vector x in $R^n$ such that Ax = b. From the viewpoint of matrix transformations, the null space of A consists of all vectors in $R^n$ that multiplication by A maps into 0, and the column space of A consists of all vectors in $R^m$ that are images of at least one vector in $R^n$ under multiplication by A. The following definition extends these ideas to general linear transformations.

DEFINITION 2

If $T: V \to W$ is a linear transformation, then the set of vectors in V that T maps into 0 is called the kernel of T and is denoted by ker(T). The set of all vectors in W that are images under T of at least one vector in V is called the range of T and is denoted by R(T).


EXAMPLE 13 Kernel and Range of a Matrix Transformation

If $T_A: R^n \to R^m$ is multiplication by the $m \times n$ matrix A, then, as discussed above, the kernel of $T_A$ is the null space of A, and the range of $T_A$ is the column space of A.

EXAMPLE 14 Kernel and Range of the Zero Transformation

Let $T: V \to W$ be the zero transformation. Since T maps every vector in V into 0, it follows that ker(T) = V. Moreover, since 0 is the only image under T of vectors in V, it follows that R(T) = {0}.

EXAMPLE 15 Kernel and Range of the Identity Operator

Let $I: V \to V$ be the identity operator. Since I(v) = v for all vectors in V, every vector in V is the image of some vector (namely, itself); thus R(I) = V. Since the only vector that I maps into 0 is 0, it follows that ker(I) = {0}.


EXAMPLE 16 Kernel and Range of an Orthogonal Projection


Let $T: R^3 \to R^3$ be the orthogonal projection on the xy-plane. As illustrated in Figure 8.1.3a, the points that T maps into 0 = (0, 0, 0) are precisely those on the z-axis,
so $\ker(T)$ is the set of points of the form (0, 0, z). As illustrated in Figure 8.1.3b, T maps the points in $R^3$
to the xy-plane, where each point in that plane is the image of each point on the vertical line above it.
Thus, $R(T)$ is the set of points of the form (x, y, 0).




(a) ker(T) is the z-axis. (b) R(T) is the entire xy-plane. 


Figure 8.1.3 


EXAMPLE 17 Kernel and Range of a Rotation


Let $T: R^2 \to R^2$ be the linear operator that rotates each vector in the xy-plane through the angle $\theta$ (Figure
8.1.4). Since every vector in the xy-plane can be obtained by rotating some vector through the angle $\theta$, it
follows that $R(T) = R^2$. Moreover, the only vector that rotates into 0 is 0, so $\ker(T) = \{0\}$.


Figure 8.1.4 


CALCULUS REQUIRED 


EXAMPLE 18 Kernel of a Differentiation Transformation


Let $V = C^1(-\infty, \infty)$ be the vector space of functions with continuous first derivatives on $(-\infty, \infty)$,
let $W = F(-\infty, \infty)$ be the vector space of all real-valued functions defined on $(-\infty, \infty)$, and let
$D: V \to W$ be the differentiation transformation $D(f) = f'(x)$. The kernel of D is the set of functions in
V with derivative zero. From calculus, this is the set of constant functions on $(-\infty, \infty)$.


Properties of Kernel and Range 


In all of the preceding examples, $\ker(T)$ and $R(T)$ turned out to be subspaces. In Example 14, Example 15, and
Example 17 they were either the zero subspace or the entire vector space. In Example 16 the kernel was a line
through the origin, and the range was a plane through the origin, both of which are subspaces of $R^3$. All of this is a
consequence of the following general theorem.


THEOREM 8.1.3 


If $T: V \to W$ is a linear transformation, then:
(a) The kernel of T is a subspace of V. 
(b) The range of T is a subspace of W. 


Proof (a) To show that $\ker(T)$ is a subspace, we must show that it contains at least one vector and is closed under
addition and scalar multiplication. By part (a) of Theorem 8.1.1, the vector 0 is in $\ker(T)$, so the kernel contains at
least one vector. Let $v_1$ and $v_2$ be vectors in $\ker(T)$, and let k be any scalar. Then

$$T(v_1 + v_2) = T(v_1) + T(v_2) = 0 + 0 = 0$$

so $v_1 + v_2$ is in $\ker(T)$. Also,

$$T(kv_1) = kT(v_1) = k0 = 0$$

so $kv_1$ is in $\ker(T)$.


Proof (b) To show that $R(T)$ is a subspace of W, we must show that it contains at least one vector and is closed
under addition and scalar multiplication. However, it contains at least the zero vector of W since $T(0) = 0$ by
part (a) of Theorem 8.1.1. To prove that it is closed under addition and scalar multiplication, we must show that if
$w_1$ and $w_2$ are vectors in $R(T)$, and if k is any scalar, then there exist vectors a and b in V for which

$$T(a) = w_1 + w_2 \quad\text{and}\quad T(b) = kw_1 \qquad (4)$$

But the fact that $w_1$ and $w_2$ are in $R(T)$ tells us that there exist vectors $v_1$ and $v_2$ in V such that

$$T(v_1) = w_1 \quad\text{and}\quad T(v_2) = w_2$$

The following computations complete the proof by showing that the vectors $a = v_1 + v_2$ and $b = kv_1$ satisfy the
equations in 4:

$$T(a) = T(v_1 + v_2) = T(v_1) + T(v_2) = w_1 + w_2$$
$$T(b) = T(kv_1) = kT(v_1) = kw_1$$


CALCULUS REQUIRED 


EXAMPLE 19 Application to Differential Equations


Differential equations of the form

$$y'' + \omega^2 y = 0 \quad (\omega \text{ a positive constant}) \qquad (5)$$

arise in the study of vibrations. The set of all solutions of this equation on the interval $(-\infty, \infty)$ is the
kernel of the linear transformation $D: C^2(-\infty, \infty) \to C(-\infty, \infty)$, given by

$$D(y) = y'' + \omega^2 y$$

It is proved in standard textbooks on differential equations that the kernel is a two-dimensional subspace
of $C^2(-\infty, \infty)$, so that if we can find two linearly independent solutions of 5, then all other solutions
can be expressed as linear combinations of those two. We leave it for you to confirm by differentiating
that

$$y_1 = \cos \omega x \quad\text{and}\quad y_2 = \sin \omega x$$

are solutions of 5. These functions are linearly independent since neither is a scalar multiple of the other,
and thus

$$y = c_1 \cos \omega x + c_2 \sin \omega x \qquad (6)$$

is a “general solution” of 5 in the sense that every choice of $c_1$ and $c_2$ produces a solution, and every
solution is of this form.
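
For readers who want to check their work, the differentiation that the example leaves as an exercise can be sketched as follows. It assumes only the standard derivatives of sine and cosine and the linearity of the transformation $D(y) = y'' + \omega^2 y$.

```latex
\begin{aligned}
y_1 &= \cos\omega x, & y_1'' &= -\omega^2\cos\omega x, & y_1'' + \omega^2 y_1 &= 0,\\
y_2 &= \sin\omega x, & y_2'' &= -\omega^2\sin\omega x, & y_2'' + \omega^2 y_2 &= 0,\\
D(c_1 y_1 + c_2 y_2) &= c_1 D(y_1) + c_2 D(y_2) = 0. &&&&
\end{aligned}
```

The last line is the statement that the kernel is closed under linear combinations, which is part (a) of Theorem 8.1.3.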


Rank and Nullity of Linear Transformations 


In Definition 1 of Section 4.8 we defined the notions of rank and nullity for an $m \times n$ matrix, and in Theorem 4.8.2,
which we called the Dimension Theorem, we proved that the sum of the rank and nullity is n. We will show next
that this result is a special case of a more general result about linear transformations. We start with the following
definition.


DEFINITION 3 


Let $T: V \to W$ be a linear transformation. If the range of T is finite-dimensional, then its dimension is called
the rank of T; and if the kernel of T is finite-dimensional, then its dimension is called the nullity of T. The
rank of T is denoted by rank(T) and the nullity of T by nullity(T).


The following theorem, whose proof is optional, generalizes Theorem 4.8.2. 


THEOREM 8.1.4 Dimension Theorem for Linear Transformations 


If $T: V \to W$ is a linear transformation from an n-dimensional vector space V to a vector space W, then

$$\operatorname{rank}(T) + \operatorname{nullity}(T) = n \qquad (7)$$

In the special case where A is an $m \times n$ matrix and $T_A: R^n \to R^m$ is multiplication by A, the kernel of $T_A$ is the null
space of A, and the range of $T_A$ is the column space of A. Thus, it follows from Theorem 8.1.4 that

$$\operatorname{rank}(T_A) + \operatorname{nullity}(T_A) = n$$
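
As a quick check of Formula 7, consider the orthogonal projection of Example 16, whose kernel is the z-axis and whose range is the xy-plane. The dimensions below are read directly from that example.

```latex
\operatorname{rank}(T) = \dim(\text{xy-plane}) = 2, \qquad
\operatorname{nullity}(T) = \dim(\text{z-axis}) = 1, \qquad
\operatorname{rank}(T) + \operatorname{nullity}(T) = 2 + 1 = 3 = \dim(R^3).
```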
OPTIONAL 


Proof of Theorem 8.1.4 We must show that

$$\dim(R(T)) + \dim(\ker(T)) = n$$

We will give the proof for the case where $1 \le \dim(\ker(T)) < n$. The cases where $\dim(\ker(T)) = 0$ and
$\dim(\ker(T)) = n$ are left as exercises. Assume $\dim(\ker(T)) = r$, and let $v_1, \ldots, v_r$ be a basis for the kernel. Since
$\{v_1, \ldots, v_r\}$ is linearly independent, Theorem 4.5.5b states that there are $n - r$ vectors, $v_{r+1}, \ldots, v_n$, such that the
extended set $\{v_1, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ is a basis for V. To complete the proof, we will show that the $n - r$ vectors
in the set $S = \{T(v_{r+1}), \ldots, T(v_n)\}$ form a basis for the range of T. It will then follow that

$$\dim(R(T)) + \dim(\ker(T)) = (n - r) + r = n$$

First we show that S spans the range of T. If b is any vector in the range of T, then $b = T(v)$ for some vector v in
V. Since $\{v_1, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ is a basis for V, the vector v can be written in the form

$$v = c_1 v_1 + \cdots + c_r v_r + c_{r+1} v_{r+1} + \cdots + c_n v_n$$

Since $v_1, \ldots, v_r$ lie in the kernel of T, we have $T(v_1) = \cdots = T(v_r) = 0$, so

$$b = T(v) = c_{r+1} T(v_{r+1}) + \cdots + c_n T(v_n)$$

Thus S spans the range of T.

Finally, we show that S is a linearly independent set and consequently forms a basis for the range of T. Suppose that
some linear combination of the vectors in S is zero; that is,

$$k_{r+1} T(v_{r+1}) + \cdots + k_n T(v_n) = 0 \qquad (8)$$

We must show that $k_{r+1} = \cdots = k_n = 0$. Since T is linear, 8 can be rewritten as

$$T(k_{r+1} v_{r+1} + \cdots + k_n v_n) = 0$$

which says that $k_{r+1} v_{r+1} + \cdots + k_n v_n$ is in the kernel of T. This vector can therefore be written as a linear
combination of the basis vectors $\{v_1, \ldots, v_r\}$, say

$$k_{r+1} v_{r+1} + \cdots + k_n v_n = k_1 v_1 + \cdots + k_r v_r$$

Thus,

$$k_1 v_1 + \cdots + k_r v_r - k_{r+1} v_{r+1} - \cdots - k_n v_n = 0$$

Since $\{v_1, \ldots, v_n\}$ is linearly independent, all of the k's are zero; in particular, $k_{r+1} = \cdots = k_n = 0$, which
completes the proof.


Concept Review 


* Linear transformation
* Linear operator
* Zero transformation
* Identity operator
* Contraction
* Dilation
* Evaluation transformation
* Kernel
* Range
* Rank
* Nullity


Skills 

* Determine whether a function is a linear transformation.
* Find a formula for a linear transformation $T: V \to W$ given the values of T on a basis for V.
* Find a basis for the kernel of a linear transformation.
* Find a basis for the range of a linear transformation.
* Find the rank of a linear transformation.
* Find the nullity of a linear transformation.


Exercise Set 8.1 


In Exercises 1-8, determine whether the function is a linear transformation. Justify your answer.
1. $T: V \to R$, where V is an inner product space, and $T(u) = \|u\|$.

Answer:

Nonlinear


2. $T: R^3 \to R^3$, where $v_0$ is a fixed vector in $R^3$ and $T(u) = u \times v_0$.
3. $T: M_{22} \to M_{23}$, where B is a fixed $2 \times 3$ matrix and $T(A) = AB$.


Answer:


Linear
4. T. Mj, > К, where Т(А) = (А). 
5. $F: M_{mn} \to M_{nm}$, where $F(A) = A^T$.


Answer: 


Linear 


6. $T: M_{22} \to R$, where


(a) d: ;]) eed 
c d 


b) „а b]\_ 2,2 
1[: 5) 


7. $T: P_2 \to P_2$, where
(a) $T(a_0 + a_1 x + a_2 x^2) = a_0 + a_1(x + 1) + a_2(x + 1)^2$
(b) $T(a_0 + a_1 x + a_2 x^2) = (a_0 + 1) + (a_1 + 1)x + (a_2 + 1)x^2$

Answer:


(a) Linear
(b) Nonlinear
8. $T: F(-\infty, \infty) \to F(-\infty, \infty)$, where
(a) $T(f(x)) = 1 + f(x)$
(b) $T(f(x)) = f(x + 1)$
9. Consider the basis $S = \{v_1, v_2\}$ for $R^2$, where $v_1 = (1, 1)$ and $v_2 = (1, 0)$, and let $T: R^2 \to R^2$ be the linear
operator for which
$T(v_1) = (1, -2)$ and $T(v_2) = (-4, 1)$
Find a formula for $T(x_1, x_2)$, and use that formula to find $T(5, -3)$.


Answer:


$T(x_1, x_2) = (-4x_1 + 5x_2,\ x_1 - 3x_2)$; $T(5, -3) = (-35, 14)$
10. Consider the basis $S = \{v_1, v_2\}$ for $R^2$, where $v_1 = (-2, 1)$ and $v_2 = (1, 3)$, and let $T: R^2 \to R^3$ be the
linear transformation such that
$T(v_1) = (-1, 2, 0)$ and $T(v_2) = (0, -3, 5)$
Find a formula for $T(x_1, x_2)$, and use that formula to find $T(2, -3)$.
11. Consider the basis $S = \{v_1, v_2, v_3\}$ for $R^3$, where $v_1 = (1, 1, 1)$, $v_2 = (1, 1, 0)$, and $v_3 = (1, 0, 0)$, and let
$T: R^3 \to R^3$ be the linear operator for which
$T(v_1) = (2, -1, 4)$, $T(v_2) = (3, 0, 1)$,
$T(v_3) = (-1, 5, 1)$
Find a formula for $T(x_1, x_2, x_3)$, and use that formula to find $T(2, 4, -1)$.


Answer:


$T(x_1, x_2, x_3) = (-x_1 + 4x_2 - x_3,\ 5x_1 - 5x_2 - x_3,\ x_1 + 3x_3)$; $T(2, 4, -1) = (15, -9, -1)$
12. Consider the basis $S = \{v_1, v_2, v_3\}$ for $R^3$, where $v_1 = (1, 2, 1)$, $v_2 = (2, 9, 0)$, and $v_3 = (3, 3, 4)$, and let
$T: R^3 \to R^2$ be the linear transformation for which

$T(v_1) = (1, 0)$, $T(v_2) = (-1, 1)$, $T(v_3) = (0, 1)$
Find a formula for $T(x_1, x_2, x_3)$, and use that formula to find $T(7, 13, 7)$.
13. Let $v_1$, $v_2$, and $v_3$ be vectors in a vector space V, and let $T: V \to R^3$ be a linear transformation for which
$T(v_1) = (1, -1, 2)$, $T(v_2) = (0, 3, 2)$,
$T(v_3) = (-3, 1, 2)$
Find $T(2v_1 - 3v_2 + 4v_3)$.
Answer:
$T(2v_1 - 3v_2 + 4v_3) = (-10, -7, 6)$
14. Let $T: R^2 \to R^2$ be the linear operator given by the formula
$T(x, y) = (2x - y,\ -8x + 4y)$
Which of the following vectors are in $R(T)$?
(a) (1, -4)
(b) (5, 9)
(c) (-3, 12)


15. Let $T: R^2 \to R^2$ be the linear operator in Exercise 14. Which of the following vectors are in $\ker(T)$?
(a) (5, 10)
(b) (3, 2)
(c) (1, 1)


Answer: 


(a) 


16. Let $T: R^4 \to R^3$ be the linear transformation given by the formula
$T(x_1, x_2, x_3, x_4) = (4x_1 + x_2 - 2x_3 - 3x_4,\ 2x_1 + x_2 + x_3 - 4x_4,\ 6x_1 - 9x_3 + 9x_4)$
Which of the following are in $R(T)$?
(a) (0, 0, 6)
(b) (1, 3, 0)
(c) (2, 4, 1)


17. Let $T: R^4 \to R^3$ be the linear transformation in Exercise 16. Which of the following are in $\ker(T)$?
(a) (3, -8, 2, 0)
(b) (9, 9, 0, 1)
(c) (0, -4, 1, 0)


Answer:


(a)


18. Let $T: P_2 \to P_3$ be the linear transformation defined by $T(p(x)) = xp(x)$. Which of the following are in
$\ker(T)$?
(a) $x^2$
(b) 0
(c) $1 + x$


19. Let $T: P_2 \to P_3$ be the linear transformation in Exercise 18. Which of the following are in $R(T)$?
(a) $x + x^2$
(b) $1 + x$

(c) eer 


Answer: 


(a) 

20. Find a basis for the kernel of
(a) the linear operator in Exercise 14.
(b) the linear transformation in Exercise 16.
(c) the linear transformation in Exercise 18.


21. Find a basis for the range of
(a) the linear operator in Exercise 14.
(b) the linear transformation in Exercise 16.
(c) the linear transformation in Exercise 18.


Answer:


(a) (1, -4)
(b) (4, 2, 6), (1, 1, 0), (-3, -4, 9)
(c) $x, x^2, x^3$


22. Verify Formula 7 of the dimension theorem for 
(a) the linear operator in Exercise 14. 
(b) the linear transformation in Exercise 16. 


(c) the linear transformation in Exercise 18. 


In Exercises 23-26, let T be multiplication by the matrix A. Find 
(a) a basis for the range of T. 

(b) a basis for the kernel of T. 

(c) the rank and nullity of T.

(d) the rank and nullity of A. 


(b) | —14 

19 

11 
(c) rank(T) = 2, nullity(T) = 1
(d) rank(A) = 2, nullity(A) = 1


24. 20 —1 

А=| 40 -2 

20 0 0 

25 А A dqx2 

1 ^25 -* 1.0 

Answer 

(а) |1 0 
Ol |1 

(b) | =1 —4 

=] 2 

1 0 

0 7 


(c) rank(T) = nullity(T) = 2
(d) rank(A) = nullity(A) = 2




1 4 5 9 
3 —-2 1 — 
A= 
-] 0-10 -1 
4 7. оз c8 
27. Describe the kernel and range of
(a) the orthogonal projection on the xz-plane.
(b) the orthogonal projection on the yz-plane.
(c) the orthogonal projection on the plane defined by the equation y = x.
Answer:


(a) Kernel: y-axis; range: xz-plane
(b) Kernel: x-axis; range: yz-plane
(c) Kernel: the line through the origin perpendicular to the plane y = x; range: plane y = x


28. Let V be any vector space, and let $T: V \to V$ be defined by $T(v) = 3v$.
(a) What is the kernel of T?
(b) What is the range of T?


29. In each part, use the given information to find the nullity of the linear transformation T.
(a) $T: R^5 \to R^7$ has rank 3.
(b) $T: P_4 \to P_3$ has rank 1.
(c) The range of $T: R^6 \to R^3$ is $R^3$.
(d) $T: M_{22} \to M_{22}$ has rank 3.


Answer:


(a) nullity(T) = 2
(b) nullity(T) = 4
(c) nullity(T) = 3
(d) nullity(T) = 1


30. Let A be a $7 \times 6$ matrix such that $Ax = 0$ has only the trivial solution, and let $T: R^6 \to R^7$ be multiplication by
A. Find the rank and nullity of T.


31. Let A be a $5 \times 7$ matrix with rank 4.
(a) What is the dimension of the solution space of $Ax = 0$?
(b) Is $Ax = b$ consistent for all vectors b in $R^5$? Explain.


Answer:


(a) 3
(b) No


32. Let $T: R^3 \to W$ be a linear transformation from $R^3$ to any vector space. Give a geometric description of $\ker(T)$.


33. Let $T: V \to R^3$ be a linear transformation from any vector space to $R^3$. Give a geometric description of $R(T)$.


Answer:
A line through the origin, a plane through the origin, the origin only, or all of $R^3$


34. Let $T: R^3 \to R^3$ be multiplication by

$$\begin{bmatrix} 1 & 3 & 4 \\ 3 & 4 & 7 \\ -2 & 2 & 0 \end{bmatrix}$$

(a) Show that the kernel of T is a line through the origin, and find parametric equations for it.
(b) Show that the range of T is a plane through the origin, and find an equation for it.


35. (a) Show that if $a_1$, $a_2$, $b_1$, and $b_2$ are any scalars, then the formula
$F(x, y) = (a_1 x + b_1 y,\ a_2 x + b_2 y)$
defines a linear operator on $R^2$.
(b) Does the formula $F(x, y) = (a_1 x^2 + b_1 y^2,\ a_2 x^2 + b_2 y^2)$ define a linear operator on $R^2$? Explain.


Answer:


(b) No


36. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis for a vector space V, and let $T: V \to W$ be a linear transformation. Show that if
$T(v_1) = T(v_2) = \cdots = T(v_n) = 0$
then T is the zero transformation.


37. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis for a vector space V, and let $T: V \to V$ be a linear operator. Show that if
$T(v_1) = v_1,\ T(v_2) = v_2,\ \ldots,\ T(v_n) = v_n$
then T is the identity transformation on V.


38. For a positive integer $n > 1$, let $T: M_{nn} \to R$ be the linear transformation defined by $T(A) = \operatorname{tr}(A)$, where A is
an $n \times n$ matrix with real entries. Determine the dimension of $\ker(T)$.


39. Prove: If $\{v_1, v_2, \ldots, v_n\}$ is a basis for V and $w_1, w_2, \ldots, w_n$ are vectors in W, not necessarily distinct, then
there exists a linear transformation $T: V \to W$ such that
$T(v_1) = w_1,\ T(v_2) = w_2,\ \ldots,\ T(v_n) = w_n$


40. (Calculus required) Let $V = C[a, b]$ be the vector space of functions continuous on $[a, b]$, and let $T: V \to V$
be the transformation defined by


T(f) =5f (х) 4 af fiat 


Is T a linear operator?


41. (Calculus required) Let $D: P_3 \to P_2$ be the differentiation transformation $D(p) = p'(x)$. What is the kernel of
D?


Answer:


$\ker(D)$ consists of all constant polynomials.


1 
ш (Calculus required) Let J: P4 — R be the integration transformation (р) = / р(х)ах. What is the kernel 


of J? 


43. (Calculus required) Let V be the vector space of real-valued functions with continuous derivatives of all orders
on the interval $(-\infty, \infty)$, and let $W = F(-\infty, \infty)$ be the vector space of real-valued functions defined on
$(-\infty, \infty)$.
(a) Find a linear transformation $T: V \to W$ whose kernel is $P_3$.
(b) Find a linear transformation $T: V \to W$ whose kernel is $P_n$.


Answer:
(a) $T(f(x)) = f^{(4)}(x)$
(b) $T(f(x)) = f^{(n+1)}(x)$


44. If A is an $m \times n$ matrix, and if the linear system $Ax = b$ is consistent for every vector b in $R^m$, what can you
say about the range of $T_A: R^n \to R^m$?


True-False Exercises 
In parts (a)-(i) determine whether the statement is true or false, and justify your answer.


(a) If $T(c_1 v_1 + c_2 v_2) = c_1 T(v_1) + c_2 T(v_2)$ for all vectors $v_1$ and $v_2$ in V and all scalars $c_1$ and $c_2$, then T is a
linear transformation.


Answer: 


True 


(b) If v is a nonzero vector in V, then there is exactly one linear transformation $T: V \to W$ such that
$T(-v) = -T(v)$.
Answer: 


False 


(c) There is exactly one linear transformation $T: V \to W$ for which $T(u + v) = T(u - v)$ for all vectors u and v in
V.


Answer: 


True 


(d) If $v_0$ is a nonzero vector in V, then the formula $T(v) = v_0 + v$ defines a linear operator on V.
Answer: 


False 


(e) The kernel of a linear transformation is a vector space. 
Answer: 


True 


(f) The range of a linear transformation is a vector space. 
Answer: 


True 


(g) If T: Pg — M 35 is a linear transformation, then the nullity of T is 3. 
Answer: 


False 
(h) The function $T: M_{22} \to R$ defined by $T(A) = \det A$ is a linear transformation.


Answer: 


False 


(i) The linear transformation 7: M 2; — №7 defined by 


13 
Т(А) = A 
(4 E 1 
has rank 1. 
Answer: 
False 
8.2 Isomorphism


In this section we will establish a fundamental connection between real finite-dimensional vector spaces and the Euclidean
space $R^n$. This connection is not only important theoretically, but it has practical applications in that it allows us to perform
vector computations in general vector spaces by working with the vectors in $R^n$.


One-to-One and Onto 


Although many of the theorems in this text have been concerned exclusively with the vector space $R^n$, this is not as limiting
as it might seem. As we will show, the vector space $R^n$ is the “mother” of all real n-dimensional vector spaces in the sense
that any such space might differ from $R^n$ in the notation used to represent vectors, but not in its algebraic structure. To
explain what we mean by this, we will need two definitions, the first of which is a generalization of Definition 1 in Section
4.10. (See Figure 8.2.1).


DEFINITION 1 


If $T: V \to W$ is a linear transformation from a vector space V to a vector space W, then T is said to be one-to-one if
T maps distinct vectors in V into distinct vectors in W.


DEFINITION 2 


If $T: V \to W$ is a linear transformation from a vector space V to a vector space W, then T is said to be onto (or onto
W) if every vector in W is the image of at least one vector in V.


Figure 8.2.1 One-to-one: distinct vectors in V have distinct images in W. Not one-to-one: there exist distinct vectors in V with the same image. Onto W: every vector in W is the image of some vector in V. Not onto W: not every vector in W is the image of some vector in V.


The following theorem provides a useful way of telling whether a linear transformation is one-to-one by examining its 
kernel. 


THEOREM 8.2.1 


If $T: V \to W$ is a linear transformation, then the following statements are equivalent.


(a) T is one-to-one.


(b) $\ker(T) = \{0\}$.


Proof (a) ⇒ (b) Since T is linear, we know that $T(0) = 0$ by Theorem 8.1.1a. Since T is one-to-one, there can be no
other vectors in V that map into 0, so $\ker(T) = \{0\}$.


(b) ⇒ (a) Assume that $\ker(T) = \{0\}$. If u and v are distinct vectors in V, then $u - v \ne 0$. This implies that $T(u - v) \ne 0$,
for otherwise $\ker(T)$ would contain a nonzero vector. Since T is linear, it follows that


$$T(u) - T(v) = T(u - v) \ne 0$$


so T maps distinct vectors in V into distinct vectors in W and hence is one-to-one.


In the special case where V is finite-dimensional and T is a linear operator on V, then we can add a third statement to those 
in Theorem 8.2.1. 


THEOREM 8.2.2 


If V is a finite-dimensional vector space, and if $T: V \to V$ is a linear operator, then the following statements are
equivalent.


(a) T is one-to-one.
(b) $\ker(T) = \{0\}$.
(c) T is onto [i.e., $R(T) = V$].


Proof We already know that (a) and (b) are equivalent by Theorem 8.2.1, so it suffices to show that (b) and (c) are
equivalent. We leave it for you to do this by assuming that $\dim(V) = n$ and applying Theorem 8.1.4.


EXAMPLE 1 Dilations and Contractions Are One-to-One and Onto


Show that if V is a finite-dimensional vector space and c is any nonzero scalar, then the linear operator
$T: V \to V$ defined by $T(v) = cv$ is one-to-one and onto.


Solution The operator T is onto (and hence one-to-one), for if v is any vector in V, then that vector is the
image of the vector $(1/c)v$.


EXAMPLE 2 Matrix Operators


If $T_A: R^n \to R^n$ is the matrix operator $T_A(x) = Ax$, then it follows from parts (r) and (s) of Theorem 5.1.6 that
$T_A$ is one-to-one and onto if and only if A is invertible.
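
For a general $m \times n$ matrix the two properties can be tested separately through the rank. The sketch below is an illustration rather than part of the text: by Theorem 8.2.1 and the Dimension Theorem, multiplication by A is one-to-one exactly when rank(A) = n, and since the range is the column space it is onto $R^m$ exactly when rank(A) = m. The function names are hypothetical, and the two test matrices are chosen for illustration.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # find a pivot in column c at or below row r
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_one_to_one(rows):
    # nullity = n - rank, so the kernel is {0} exactly when rank = n
    return rank(rows) == len(rows[0])

def is_onto(rows):
    # the range is the column space, which is all of R^m exactly when rank = m
    return rank(rows) == len(rows)

A = [[1, -2], [2, -4], [-3, 6]]   # columns are proportional
C = [[4, -2], [1, 5], [5, 3]]     # columns are independent
print(is_one_to_one(A), is_onto(A))
print(is_one_to_one(C), is_onto(C))
```

For a square matrix the two tests coincide, which is the content of Theorem 8.2.2 in the matrix setting.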


EXAMPLE 3 Shifting Operators


Let $V = R^\infty$ be the sequence space discussed in Example 3 of Section 4.1, and consider the linear “shifting
operators” on V defined by


$$T_1(u_1, u_2, \ldots, u_n, \ldots) = (0, u_1, u_2, \ldots, u_n, \ldots)$$
$$T_2(u_1, u_2, \ldots, u_n, \ldots) = (u_2, u_3, \ldots, u_n, \ldots)$$


(a) Show that $T_1$ is one-to-one but not onto.


(b) Show that $T_2$ is onto but not one-to-one.


Solution


(a) The operator $T_1$ is one-to-one because distinct sequences in $R^\infty$ obviously have distinct images. This
operator is not onto because no vector in $R^\infty$ maps into the sequence (1, 0, 0, ..., 0, ...), for example.


(b) The operator $T_2$ is not one-to-one because, for example, the vectors (1, 0, 0, ..., 0, ...) and
(2, 0, 0, ..., 0, ...) both map into (0, 0, 0, ..., 0, ...). This operator is onto because every possible
sequence of real numbers can be obtained with an appropriate choice of the numbers $u_2, u_3, \ldots, u_n, \ldots$
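
The behavior of the two shifting operators can be imitated in code. This is only an illustrative sketch: an infinite sequence is modeled, by assumption, as a function from the index n = 1, 2, 3, ... to the n-th term, and the names `T1`, `T2`, and `head` are hypothetical.

```python
def T1(u):
    """Right shift: (u1, u2, ...) -> (0, u1, u2, ...)."""
    return lambda n: 0 if n == 1 else u(n - 1)

def T2(u):
    """Left shift: (u1, u2, ...) -> (u2, u3, ...)."""
    return lambda n: u(n + 1)

def head(u, count=5):
    """First few terms of a sequence, for inspection."""
    return [u(n) for n in range(1, count + 1)]

u = lambda n: n                        # the sequence (1, 2, 3, ...)
e1 = lambda n: 1 if n == 1 else 0      # the sequence (1, 0, 0, ...)
zero = lambda n: 0                     # the zero sequence

print(head(T1(u)))        # first term is always 0, so (1, 0, 0, ...) is never an image
print(head(T2(e1)))       # a nonzero sequence that T2 sends to the zero sequence
print(head(T2(T1(u))))    # T2 undoes T1 ...
print(head(T1(T2(u))))    # ... but T1 does not undo T2: the first term is lost
```

Only finitely many terms can be inspected this way, so the output illustrates the argument in the solution rather than proving it.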


Why does Example 3 not violate Theorem 8.2.2? 


EXAMPLE 4 Basic Transformations That Are One-to-One and Onto


The linear transformations $T_1: P_3 \to R^4$ and $T_2: M_{22} \to R^4$ defined by

$$T_1(a + bx + cx^2 + dx^3) = (a, b, c, d)$$

$$T_2\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a, b, c, d)$$

are both one-to-one and onto (verify by showing that their kernels contain only the zero vector).


EXAMPLE 5 A One-to-One Linear Transformation


Let $T: P_n \to P_{n+1}$ be the linear transformation
$$T(p) = T(p(x)) = xp(x)$$
discussed in Example 5 of Section 8.1. If
$$p = p(x) = c_0 + c_1 x + \cdots + c_n x^n \quad\text{and}\quad q = q(x) = d_0 + d_1 x + \cdots + d_n x^n$$
are distinct polynomials, then they differ in at least one coefficient. Thus,


$$T(p) = c_0 x + c_1 x^2 + \cdots + c_n x^{n+1} \quad\text{and}\quad T(q) = d_0 x + d_1 x^2 + \cdots + d_n x^{n+1}$$


also differ in at least one coefficient. It follows that T is one-to-one since it maps distinct polynomials p and q
into distinct polynomials T(p) and T(q).


CALCULUS REQUIRED 
EXAMPLE 6 A Transformation That Is Not One-to-One


Let
$$D: C^1(-\infty, \infty) \to F(-\infty, \infty)$$


be the differentiation transformation discussed in Example 11 of Section 8.1. This linear transformation is not
one-to-one because it maps functions that differ by a constant into the same function. For example,


$$D(x^2) = D(x^2 + 1) = 2x$$


Dimension and Linear Transformations 


In the exercises we will ask you to prove the following two important facts about a linear transformation $T: V \to W$ in the
case where V and W are finite-dimensional:


1. If $\dim(W) < \dim(V)$, then T cannot be one-to-one.
2. If $\dim(V) < \dim(W)$, then T cannot be onto.


Stated informally, if a linear transformation maps a “bigger” space to a “smaller” space, then some points in the “bigger”
space must have the same image; and if a linear transformation maps a “smaller” space to a “bigger” space, then there must
be points in the “bigger” space that are not images of any points in the “smaller” space.


Remark These observations tell us, for example, that any linear transformation from $R^3$ to $R^2$ must map some distinct
points of $R^3$ into the same point in $R^2$, and they also tell us that there is no linear transformation that maps $R^2$ onto all of $R^3$.


Isomorphism 


Our next definition paves the way for the main result in this section. 


DEFINITION 3 


If a linear transformation $T: V \to W$ is both one-to-one and onto, then T is said to be an isomorphism, and the
vector spaces V and W are said to be isomorphic.


The word isomorphic is derived from the Greek words iso, meaning "identical," and morphe, meaning "form." This 
terminology is appropriate because, as we will now explain, isomorphic vector spaces have the same “algebraic form," even 
though they may consist of different kinds of objects. To illustrate this idea, examine Table 1 in which we have shown how 
the isomorphism 


$$a_0 + a_1 x + a_2 x^2 \ \xrightarrow{\ T\ }\ (a_0, a_1, a_2)$$
matches up vector operations in $P_2$ and $R^3$.
Table 1
Operation in $P_2$ | Operation in $R^3$
$3(1 - 2x + 3x^2) = 3 - 6x + 9x^2$ | $3(1, -2, 3) = (3, -6, 9)$
$(2 + x - x^2) + (1 - x + 5x^2) = 3 + 4x^2$ | $(2, 1, -1) + (1, -1, 5) = (3, 0, 4)$
$(4 + 2x + 3x^2) - (2 - 4x + 3x^2) = 2 + 6x$ | $(4, 2, 3) - (2, -4, 3) = (2, 6, 0)$

The following theorem, which is one of the most important results in linear algebra, reveals the fundamental importance of
the vector space $R^n$.


THEOREM 8.2.3


Every real n-dimensional vector space is isomorphic to $R^n$.


Theorem 8.2.3 tells us that a real n-dimensional vector
space may differ from $R^n$ in notation, but its algebraic
structure will be the same.


Proof Let V be a real n-dimensional vector space. To prove that V is isomorphic to $R^n$ we must find a linear
transformation $T: V \to R^n$ that is one-to-one and onto. For this purpose, let

$$v_1, v_2, \ldots, v_n$$

be any basis for V, let

$$u = k_1 v_1 + k_2 v_2 + \cdots + k_n v_n \qquad (1)$$

be the representation of a vector u in V as a linear combination of the basis vectors, and define the transformation
$T: V \to R^n$ by

$$T(u) = (k_1, k_2, \ldots, k_n) \qquad (2)$$

We will show that T is an isomorphism (linear, one-to-one, and onto). To prove the linearity, let u and v be vectors in V, let
c be a scalar, and let

$$u = k_1 v_1 + k_2 v_2 + \cdots + k_n v_n \quad\text{and}\quad v = d_1 v_1 + d_2 v_2 + \cdots + d_n v_n \qquad (3)$$

be the representations of u and v as linear combinations of the basis vectors. Then it follows from 1 that

$$T(cu) = T(ck_1 v_1 + ck_2 v_2 + \cdots + ck_n v_n) = (ck_1, ck_2, \ldots, ck_n) = c(k_1, k_2, \ldots, k_n) = cT(u)$$

and it follows from 2 that

$$T(u + v) = T((k_1 + d_1)v_1 + (k_2 + d_2)v_2 + \cdots + (k_n + d_n)v_n)$$
$$= (k_1 + d_1, k_2 + d_2, \ldots, k_n + d_n)$$
$$= (k_1, k_2, \ldots, k_n) + (d_1, d_2, \ldots, d_n)$$
$$= T(u) + T(v)$$

which shows that T is linear. To show that T is one-to-one, we must show that if u and v are distinct vectors in V, then so are
their images in $R^n$. But if $u \ne v$, and if the representations of these vectors in terms of the basis vectors are as in 3, then we
must have $k_i \ne d_i$ for at least one i. Thus,

$$T(u) = (k_1, k_2, \ldots, k_n) \ne (d_1, d_2, \ldots, d_n) = T(v)$$

which shows that u and v have distinct images under T. Finally, the transformation T is onto, for if

$$w = (k_1, k_2, \ldots, k_n)$$

is any vector in $R^n$, then it follows from 2 that w is the image under T of the vector

$$u = k_1 v_1 + k_2 v_2 + \cdots + k_n v_n$$

Remark Note that the isomorphism T in Formula 2 of the foregoing proof is the coordinate map

$$u \ \xrightarrow{\ T\ }\ (k_1, k_2, \ldots, k_n) = (u)_S$$

that maps u into its coordinate vector with respect to the basis $S = \{v_1, v_2, \ldots, v_n\}$. Since there are generally many
possible bases for a given vector space V, there are generally many possible isomorphisms between V and $R^n$, one for each
different basis.
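
The remark can be made concrete in $R^2$, where the coordinate map for a basis $\{v_1, v_2\}$ amounts to solving $k_1 v_1 + k_2 v_2 = u$. The following sketch is an illustration under stated assumptions: it uses Cramer's rule for the 2-by-2 system, the basis (1, 1), (1, 0) is chosen for illustration, and `coordinate_map` is a hypothetical helper name.

```python
from fractions import Fraction

def coordinate_map(v1, v2):
    """Return the map u -> (k1, k2) with u = k1*v1 + k2*v2, for a basis {v1, v2} of R^2."""
    det = Fraction(v1[0] * v2[1] - v2[0] * v1[1])
    if det == 0:
        raise ValueError("v1 and v2 do not form a basis for R^2")

    def T(u):
        # Cramer's rule applied to the system [v1 v2] (k1, k2)^T = u
        k1 = (u[0] * v2[1] - v2[0] * u[1]) / det
        k2 = (v1[0] * u[1] - u[0] * v1[1]) / det
        return (k1, k2)

    return T

T_S = coordinate_map((1, 1), (1, 0))   # coordinates relative to S = {(1, 1), (1, 0)}
T_E = coordinate_map((1, 0), (0, 1))   # coordinates relative to the standard basis

u = (5, -3)
print(T_S(u))   # u = -3*(1, 1) + 8*(1, 0)
print(T_E(u))   # the standard basis returns the entries of u itself
```

The two maps send the same vector to different coordinate vectors, which illustrates that each choice of basis gives a different isomorphism.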


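EXAMPLE 7 The Natural Isomorphism from $P_{n-1}$ to $R^n$


We leave it for you to verify that the mapping

$$a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} \ \xrightarrow{\ T\ }\ (a_0, a_1, \ldots, a_{n-1})$$

from $P_{n-1}$ to $R^n$ is one-to-one, onto, and linear. This is called the natural isomorphism from $P_{n-1}$ to $R^n$
because, as the following computations show, it maps the natural basis $\{1, x, x^2, \ldots, x^{n-1}\}$ for $P_{n-1}$ into the
standard basis for $R^n$:

$$1 = 1 + 0x + 0x^2 + \cdots + 0x^{n-1} \ \xrightarrow{\ T\ }\ (1, 0, 0, \ldots, 0)$$
$$x = 0 + x + 0x^2 + \cdots + 0x^{n-1} \ \xrightarrow{\ T\ }\ (0, 1, 0, \ldots, 0)$$
$$\vdots$$
$$x^{n-1} = 0 + 0x + 0x^2 + \cdots + x^{n-1} \ \xrightarrow{\ T\ }\ (0, 0, 0, \ldots, 1)$$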


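EXAMPLE 8 The Natural Isomorphism from $M_{22}$ to $R^4$


The matrices

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

form a basis for the vector space $M_{22}$ of $2 \times 2$ matrices. An isomorphism $T: M_{22} \to R^4$ can be constructed by
first writing a matrix A in $M_{22}$ in terms of the basis vectors as

$$A = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} = a_1\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + a_2\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + a_3\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + a_4\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

and then defining T as

$$T(A) = (a_1, a_2, a_3, a_4)$$

Thus, for example,

$$\begin{bmatrix} 1 & -3 \\ 4 & 6 \end{bmatrix} \ \xrightarrow{\ T\ }\ (1, -3, 4, 6)$$

More generally, this idea can be used to show that the vector space $M_{mn}$ of $m \times n$ matrices with real entries is
isomorphic to $R^{mn}$.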


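EXAMPLE 9 Differentiation by Matrix Multiplication


Consider the differentiation transformation $D: P_3 \to P_2$ on the vector space of polynomials of degree three or
less. If we map $P_3$ and $P_2$ into $R^4$ and $R^3$, respectively, by the natural isomorphisms, then the transformation D
produces a corresponding matrix transformation from $R^4$ to $R^3$. Specifically, the derivative transformation

$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 \ \xrightarrow{\ D\ }\ a_1 + 2a_2 x + 3a_3 x^2$$

produces the matrix transformation

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1 \\ 2a_2 \\ 3a_3 \end{bmatrix}$$

Thus, for example, the derivative

$$\frac{d}{dx}\left(2 + x + 4x^2 - x^3\right) = 1 + 8x - 3x^2$$

can be calculated as the matrix product

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \\ 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ -3 \end{bmatrix}$$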

This idea is useful for constructing numerical algorithms to perform derivative calculations. 
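
A minimal sketch of such a calculation follows. It assumes polynomials are stored as coefficient lists $[a_0, a_1, \ldots, a_n]$ under the natural isomorphism, and the helper names `differentiation_matrix` and `mat_vec` are hypothetical.

```python
def differentiation_matrix(n):
    """Matrix of D: P_n -> P_(n-1) relative to the natural bases (n rows, n + 1 columns)."""
    return [[(j if j == i + 1 else 0) for j in range(n + 1)] for i in range(n)]

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

D = differentiation_matrix(3)
print(D)                      # [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]

coeffs = [2, 1, 4, -1]        # 2 + x + 4x^2 - x^3
print(mat_vec(D, coeffs))     # [1, 8, -3], i.e. 1 + 8x - 3x^2
```

Row i of the matrix has the single nonzero entry i + 1 in column i + 1, reflecting the rule that the derivative of $x^{i+1}$ is $(i+1)x^i$.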


Inner Product Space Isomorphisms 


In the case where V is a real n-dimensional inner product space, both V and $R^n$ have, in addition to their algebraic structure,
a geometric structure arising from their respective inner products. Thus, it is reasonable to inquire if there exists an
isomorphism from V to $R^n$ that preserves the geometric structure as well as the algebraic structure. For example, we would
want orthogonal vectors in V to have orthogonal counterparts in $R^n$, and we would want orthonormal sets in V to
correspond to orthonormal sets in $R^n$.


In order for an isomorphism to preserve geometric structure, it obviously has to preserve inner products, since notions of
length, angle, and orthogonality are all based on the inner product. Thus, if V and W are inner product spaces, then we call
an isomorphism $T: V \to W$ an inner product space isomorphism if


$$\langle T(u), T(v) \rangle = \langle u, v \rangle$$


It can be proved that if V is any real n-dimensional inner product space and $R^n$ has the Euclidean inner product (the dot
product), then there exists an inner product space isomorphism from V to $R^n$. Under such an isomorphism, the inner
product space V has the same algebraic and geometric structure as $R^n$. In this sense, every n-dimensional inner product
space is a “carbon copy” of $R^n$ with the Euclidean inner product that differs only in the notation used to represent vectors.


EXAMPLE 10 An Inner Product Space Isomorphism


Let $R^n$ be the vector space of real n-tuples in comma-delimited form, let $M_n$ be the vector space of real $n \times 1$
matrices, let $R^n$ have the Euclidean inner product $\langle u, v \rangle = u \cdot v$, and let $M_n$ have the inner product
$\langle u, v \rangle = u^T v$ in which u and v are expressed in column form. The mapping $T: R^n \to M_n$ defined by

$$(v_1, v_2, \ldots, v_n) \ \xrightarrow{\ T\ }\ \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

is an inner product space isomorphism, so the distinction between the inner product space $R^n$ and the inner
product space $M_n$ is essentially notational, a fact that we have used many times in this text.


Concept Review 

* One-to-one
* Onto
* Isomorphism
* Isomorphic vector spaces
* Natural isomorphism
* Inner product space isomorphism

Skills 

* Determine whether a linear transformation is one-to-one. 
* Determine whether a linear transformation is onto. 


* Determine whether a linear transformation is an isomorphism. 


Exercise Set 8.2 


1. In each part, find ker(£), and determine whether the linear transformation T is one-to-one. 
(а) T:R? — R?, where T(x, y) = (y, x) 
(b) T:R? — R?, where T(x, y) = (0, 2x + Зу) 
(c) T: R2 — R3, where T(x, y) = (x +y, x — y) 
(d) T:R? — R3, where T(x, y) = (x, y, x +y) 
(e) T: R? — А3, where T(x, y) = (x—y,y»-—x,2x—2y) 
(f) T.R? — R2, where T(x, y,z) = (x+y +z, x = y —z) 


Answer: 


(a) $\ker(T) = \{\mathbf{0}\}$; $T$ is one-to-one 
(b) $\ker(T) = \{k(-\tfrac{3}{2}, 1)\}$; $T$ is not one-to-one 
(c) $\ker(T) = \{\mathbf{0}\}$; $T$ is one-to-one 
(d) $\ker(T) = \{\mathbf{0}\}$; $T$ is one-to-one 
(e) $\ker(T) = \{k(1, 1)\}$; $T$ is not one-to-one 
(f) $\ker(T) = \{k(0, 1, -1)\}$; $T$ is not one-to-one 


2. Which of the transformations in Exercise 1 are onto? 
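3. In each part, determine whether multiplication by $A$ is a one-to-one linear transformation. 
(a) $A = \begin{bmatrix} 1 & -2 \\ 2 & -4 \\ -3 & 6 \end{bmatrix}$ 
(b) $A = \begin{bmatrix} 1 & 3 & 5 & 7 \\ 2 & -1 & 2 & 4 \\ -1 & 3 & 0 & 0 \end{bmatrix}$ 
(c) $A = \begin{bmatrix} 4 & -2 \\ 1 & 5 \\ 5 & 3 \end{bmatrix}$ 
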
Answer: 


(a) Not one-to-one 
(b) Not one-to-one 


(c) One-to-one 


4. Which of the transformations in Exercise 3 are onto? 


5. As indicated in the accompanying figure, let $T: R^2 \to R^2$ be the orthogonal projection on the line $y = x$. 
(a) Find the kernel of $T$. 


(b) Is T one-to-one? Justify your conclusion. 


Figure Ех-5 


Answer: 


(a) $\ker(T) = \{k(-1, 1)\}$ 
(b) $T$ is not one-to-one since $\ker(T) \neq \{\mathbf{0}\}$. 

6. As indicated in the accompanying figure, let $T: R^2 \to R^2$ be the linear operator that reflects each point about the $y$-axis. 
(a) Find the kernel of $T$. 


(b) Is T one-to-one? Justify your conclusion. 


Figure Ex-6 


7. In each part, use the given information to determine whether the linear transformation $T$ is one-to-one. 
(a) $T: R^n \to R^m$, $\operatorname{nullity}(T) = 0$ 
(b) $T: R^n \to R^n$, $\operatorname{rank}(T) = n - 1$ 
(c) $T: R^m \to R^n$, $n < m$ 
(d) $T: R^n \to R^n$, $R(T) = R^n$ 


Answer: 


(a) Т is one-to-one 
(b) T is not one-to-one 
(c) Т is not one-to-one 
(d) T is one-to-one 

8. In each part, determine whether the linear transformation T is one-to-one. 
(a) $T: P_2 \to P_3$, where $T(a_0 + a_1 x + a_2 x^2) = x(a_0 + a_1 x + a_2 x^2)$ 
(b) $T: P_2 \to P_2$, where $T(p(x)) = p(x + 1)$ 


9. Prove: If $V$ and $W$ are finite-dimensional vector spaces such that $\dim(W) < \dim(V)$, then there is no one-to-one linear 
transformation $T: V \to W$. 


10. Prove: There can be an onto linear transformation from $V$ to $W$ only if $\dim(V) \geq \dim(W)$. 
11. (a) Find an isomorphism between the vector space of all $3 \times 3$ symmetric matrices and $R^6$. 
(b) Find two different isomorphisms between the vector space of all $2 \times 2$ matrices and $R^4$. 
(c) Find an isomorphism between the vector space of all polynomials of degree at most 3 such that $p(0) = 0$ and $R^3$. 


(d) Find an isomorphism between the vector spaces $\operatorname{span}\{1, \sin(x), \cos(x)\}$ and $R^3$. 


Answer: 
(a) $T\left( \begin{bmatrix} a & b & c \\ b & d & e \\ c & e & f \end{bmatrix} \right) = \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \end{bmatrix}$ 
(b) $T_1\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}$, $\quad T_2\left( \begin{bmatrix} a & b \\ c & d \end{bmatrix} \right) = \begin{bmatrix} a \\ c \\ b \\ d \end{bmatrix}$ 
(c) $T(ax^3 + bx^2 + cx) = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ 
(d) $T(a + b\sin(x) + c\cos(x)) = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ 


12. (Calculus required) Let $J: P_1 \to R$ be the integration transformation $J(p) = \int_{-1}^{1} p(x)\,dx$. Determine whether $J$ is 
one-to-one. Justify your conclusion. 


13. (Calculus required) Let $V$ be the vector space $C^1[0, 1]$ and let $T: V \to R$ be defined by 
\[ T(f) = f(0) + 2f'(0) + 3f'(1) \] 
Verify that $T$ is a linear transformation. Determine whether $T$ is one-to-one, and justify your conclusion. 


Answer: 


$T$ is not one-to-one since, for example, $f(x) = x^2(x - 1)^2$ is in its kernel. 


14. (Calculus required) Devise a method for using matrix multiplication to differentiate functions in the vector space 
$\operatorname{span}\{1, \sin(x), \cos(x), \sin(2x), \cos(2x)\}$. Use your method to find the derivative of 
$5 - 4\sin(x) + \sin(2x) + 5\cos(2x)$. 


15. Does the formula $T(a, b, c) = ax^2 + bx + c$ define a one-to-one linear transformation from $R^3$ to $P_2$? Explain your 
reasoning. 


Answer: 


Yes; it is one-to-one 


16. Let $E$ be a fixed $2 \times 2$ elementary matrix. Does the formula $T(A) = EA$ define a one-to-one linear operator on $M_{22}$? 
Explain your reasoning. 


17. Let $\mathbf{a}$ be a fixed vector in $R^3$. Does the formula $T(\mathbf{v}) = \mathbf{a} \times \mathbf{v}$ define a one-to-one linear operator on $R^3$? Explain your 
reasoning. 


Answer: 


$T$ is not one-to-one since, for example, $\mathbf{a}$ is in its kernel. 


18. Prove that an inner product space isomorphism preserves angles and distances; that is, the angle between $\mathbf{u}$ and $\mathbf{v}$ in $V$ 
is equal to the angle between $T(\mathbf{u})$ and $T(\mathbf{v})$ in $W$, and $\|\mathbf{u} - \mathbf{v}\|_V = \|T(\mathbf{u}) - T(\mathbf{v})\|_W$. 


19. Does an inner product space isomorphism map orthonormal sets to orthonormal sets? Justify your answer. 


Answer: 


Yes 


20. Find an inner product space isomorphism between $P_5$ and $M_{23}$. 


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) The vector spaces $R^2$ and $P_2$ are isomorphic. 


Answer: 


False 


(b) If the kernel of a linear transformation $T: P_3 \to P_3$ is $\{\mathbf{0}\}$, then $T$ is an isomorphism. 


Answer: 


True 


(c) Every linear transformation from $M_{33}$ to $P_9$ is an isomorphism. 
Answer: 


False 


(d) There is a subspace of $M_{23}$ that is isomorphic to $R^4$. 
Answer: 


True 


(e) There is a $2 \times 2$ matrix $P$ such that $T: M_{22} \to M_{22}$ defined by $T(A) = AP - PA$ is an isomorphism. 
Answer: 


False 


(f) There is a linear transformation $T: P_4 \to P_4$ such that the kernel of $T$ is isomorphic to the range of $T$. 
Answer: 


False 
8.3 Compositions and Inverse Transformations 


In Section 4.10 we discussed compositions and inverses of matrix transformations. In this section we will 
extend some of those ideas to general linear transformations. 


Composition of Linear Transformations 
The following definition extends Formula 1 of Section 4.10 to general linear transformations. 


Note that the word “with” establishes the order 
of the operations in a composition. The 
composition of $T_2$ with $T_1$ is 

\[ (T_2 \circ T_1)(\mathbf{u}) = T_2(T_1(\mathbf{u})) \] 
whereas the composition of $T_1$ with $T_2$ is 

\[ (T_1 \circ T_2)(\mathbf{u}) = T_1(T_2(\mathbf{u})) \] 


DEFINITION 1 


If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, then the composition of $T_2$ with $T_1$, 
denoted by $T_2 \circ T_1$ (which is read “$T_2$ circle $T_1$”), is the function defined by the formula 


\[ (T_2 \circ T_1)(\mathbf{u}) = T_2(T_1(\mathbf{u})) \tag{1} \] 


where u is a vector in U. 


Remark Observe that this definition requires that the domain of $T_2$ (which is $V$) contain the range of $T_1$. 
This is essential for the formula $T_2(T_1(\mathbf{u}))$ to make sense (Figure 8.3.1). 


Figure 8.3.1 The composition of $T_2$ with $T_1$: $\mathbf{u}$ in $U$ is mapped to $T_1(\mathbf{u})$ in $V$ and then to $T_2(T_1(\mathbf{u}))$ in $W$. 


Our first theorem shows that the composition of two linear transformations is itself a linear transformation. 


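THEOREM 8.3.1 


If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, then $(T_2 \circ T_1): U \to W$ is also a linear 
transformation. 


Proof If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $U$ and $c$ is a scalar, then it follows from 1 and the linearity of $T_1$ and $T_2$ that 


\[ \begin{aligned} (T_2 \circ T_1)(\mathbf{u} + \mathbf{v}) &= T_2(T_1(\mathbf{u} + \mathbf{v})) = T_2(T_1(\mathbf{u}) + T_1(\mathbf{v})) \\ &= T_2(T_1(\mathbf{u})) + T_2(T_1(\mathbf{v})) \\ &= (T_2 \circ T_1)(\mathbf{u}) + (T_2 \circ T_1)(\mathbf{v}) \end{aligned} \] 


and 
\[ \begin{aligned} (T_2 \circ T_1)(c\mathbf{u}) &= T_2(T_1(c\mathbf{u})) = T_2(cT_1(\mathbf{u})) \\ &= cT_2(T_1(\mathbf{u})) = c(T_2 \circ T_1)(\mathbf{u}) \end{aligned} \] 


Thus, $T_2 \circ T_1$ satisfies the two requirements of a linear transformation. 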


EXAMPLE 1 Composition of Linear Transformations 


Let $T_1: P_1 \to P_2$ and $T_2: P_2 \to P_2$ be the linear transformations given by the formulas 
\[ T_1(p(x)) = xp(x) \quad \text{and} \quad T_2(p(x)) = p(2x + 4) \] 
Then the composition $(T_2 \circ T_1): P_1 \to P_2$ is given by the formula 
\[ (T_2 \circ T_1)(p(x)) = T_2(T_1(p(x))) = T_2(xp(x)) = (2x + 4)\,p(2x + 4) \] 
In particular, if $p(x) = c_0 + c_1 x$, then 
\[ \begin{aligned} (T_2 \circ T_1)(p(x)) &= (T_2 \circ T_1)(c_0 + c_1 x) = (2x + 4)(c_0 + c_1(2x + 4)) \\ &= c_0(2x + 4) + c_1(2x + 4)^2 \end{aligned} \] 
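

The composition in Example 1 can be checked numerically. The following is a minimal sketch, assuming a polynomial $c_0 + c_1 x + \cdots$ is stored as the coefficient list `[c0, c1, ...]`; the helper names `poly_mul`, `substitute`, and `compose` are illustrative and not part of the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def substitute(p, q):
    """Return the coefficient list of p(q(x))."""
    size = (len(p) - 1) * (len(q) - 1) + 1
    result = [0] * size
    power = [1]                      # q(x)^0
    for c in p:
        for i, a in enumerate(power):
            result[i] += c * a       # add c * q(x)^k
        power = poly_mul(power, q)   # next power of q(x)
    return result

def T1(p):
    """T1(p(x)) = x * p(x): shift every coefficient up one degree."""
    return [0] + list(p)

def T2(p):
    """T2(p(x)) = p(2x + 4)."""
    return substitute(p, [4, 2])

def compose(outer, inner):
    """Return the composition 'outer circle inner'."""
    return lambda p: outer(inner(p))

T2_after_T1 = compose(T2, T1)
print(T2_after_T1([1, 1]))   # p(x) = 1 + x  ->  [20, 18, 4]
```

For $p(x) = 1 + x$ the formula of the example gives $(2x + 4) + (2x + 4)^2 = 20 + 18x + 4x^2$, which matches the printed coefficient list.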


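EXAMPLE 2 Composition with the Identity Operator 


If $T: V \to V$ is any linear operator, and if $I: V \to V$ is the identity operator (Example 3 of Section 
8.1), then for all vectors $\mathbf{v}$ in $V$, we have 

\[ (T \circ I)(\mathbf{v}) = T(I(\mathbf{v})) = T(\mathbf{v}) \] 

\[ (I \circ T)(\mathbf{v}) = I(T(\mathbf{v})) = T(\mathbf{v}) \] 


It follows that $T \circ I$ and $I \circ T$ are the same as $T$; that is, 


\[ T \circ I = T \quad \text{and} \quad I \circ T = T \tag{2} \] 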


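As illustrated in Figure 8.3.2, compositions can be defined for more than two linear transformations. For 
example, if 


\[ T_1: U \to V, \quad T_2: V \to W, \quad \text{and} \quad T_3: W \to Y \] 


are linear transformations, then the composition $T_3 \circ T_2 \circ T_1$ is defined by 


\[ (T_3 \circ T_2 \circ T_1)(\mathbf{u}) = T_3(T_2(T_1(\mathbf{u}))) \tag{3} \] 


Figure 8.3.2 The composition of three linear transformations: $\mathbf{u}$ in $U$ is mapped to $T_1(\mathbf{u})$ in $V$, then to $T_2(T_1(\mathbf{u}))$ in $W$, and then to $T_3(T_2(T_1(\mathbf{u})))$ in $Y$. 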


Inverse Linear Transformations 


In Theorem 4.10.1 we showed that a matrix operator $T_A: R^n \to R^n$ is one-to-one if and only if the matrix $A$ is 
invertible, in which case the inverse operator is $T_{A^{-1}}$. We then showed that if $\mathbf{w}$ is the image of a vector $\mathbf{x}$ 
under the operator $T_A$, then $\mathbf{x}$ is the image under $T_{A^{-1}}$ of the vector $\mathbf{w}$ (see Figure 4.10.8). Our next objective 
is to extend the notion of invertibility to general linear transformations. 


Recall that if $T: V \to W$ is a linear transformation, then the range of $T$, denoted by $R(T)$, is the subspace of $W$ 
consisting of all images under $T$ of vectors in $V$. If $T$ is one-to-one, then each vector $\mathbf{w}$ in $R(T)$ is the image of 
a unique vector $\mathbf{v}$ in $V$. This uniqueness allows us to define a new function, called the inverse of $T$ and 
denoted by $T^{-1}$, that maps $\mathbf{w}$ back into $\mathbf{v}$ (Figure 8.3.3). 


Figure 8.3.3 The inverse of $T$ maps $T(\mathbf{v})$ back into $\mathbf{v}$. 


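It can be proved (Exercise 19) that $T^{-1}: R(T) \to V$ is a linear transformation. Moreover, it follows from the 
definition of $T^{-1}$ that 


\[ T^{-1}(T(\mathbf{v})) = T^{-1}(\mathbf{w}) = \mathbf{v} \tag{4} \] 


\[ T(T^{-1}(\mathbf{w})) = T(\mathbf{v}) = \mathbf{w} \tag{5} \] 


so that $T$ and $T^{-1}$, when applied in succession in either order, cancel the effect of each other. 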


Remark It is important to note that if $T: V \to W$ is a one-to-one linear transformation, then the domain of 
$T^{-1}$ is the range of $T$, where the range may or may not be all of $W$. However, in the special case where 
$T: V \to V$ is a one-to-one linear operator and $V$ is $n$-dimensional, then it follows from Theorem 8.2.2 that $T$ 
must also be onto, so the domain of $T^{-1}$ is all of $V$. 


EXAMPLE 3 An Inverse Transformation 


In Example 5 of Section 8.2 we showed that the linear transformation $T: P_n \to P_{n+1}$ given by 
\[ T(p) = T(p(x)) = xp(x) \] 

is one-to-one; thus, $T$ has an inverse. In this case the range of $T$ is not all of $P_{n+1}$, but rather the 
subspace of $P_{n+1}$ consisting of polynomials with a zero constant term. This is evident from the 
formula for $T$: 


\[ T(c_0 + c_1 x + \cdots + c_n x^n) = c_0 x + c_1 x^2 + \cdots + c_n x^{n+1} \] 
It follows that $T^{-1}: R(T) \to P_n$ is given by the formula 
\[ T^{-1}(c_0 x + c_1 x^2 + \cdots + c_n x^{n+1}) = c_0 + c_1 x + \cdots + c_n x^n \] 


For example, in the case where $n \geq 3$, 


\[ T^{-1}(2x - x^2 + 5x^3 + 3x^4) = 2 - x + 5x^2 + 3x^3 \] 
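

In coefficient form the transformation of Example 3 is a shift, and its inverse is the reverse shift restricted to the range. The sketch below assumes polynomials are stored as coefficient lists `[c0, c1, ...]`; the function names are illustrative only.

```python
def T(p):
    """T(p(x)) = x * p(x): shift every coefficient up one degree."""
    return [0] + list(p)

def T_inv(q):
    """Inverse of T, defined only on the range of T (zero constant term)."""
    if q[0] != 0:
        raise ValueError("q is not in the range of T: nonzero constant term")
    return list(q[1:])

q = [0, 2, -1, 5, 3]       # 2x - x^2 + 5x^3 + 3x^4
print(T_inv(q))            # [2, -1, 5, 3], that is, 2 - x + 5x^2 + 3x^3
```

The `ValueError` branch reflects the remark preceding the example: the domain of the inverse is the range of $T$, not all of $P_{n+1}$.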


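EXAMPLE 4 An Inverse Transformation 


Let $T: R^3 \to R^3$ be the linear operator defined by the formula 
\[ T(x_1, x_2, x_3) = (3x_1 + x_2,\; -2x_1 - 4x_2 + 3x_3,\; 5x_1 + 4x_2 - 2x_3) \] 


Determine whether $T$ is one-to-one; if so, find $T^{-1}(x_1, x_2, x_3)$. 


Solution It follows from Formula 12 of Section 4.9 that the standard matrix for $T$ is 


\[ [T] = \begin{bmatrix} 3 & 1 & 0 \\ -2 & -4 & 3 \\ 5 & 4 & -2 \end{bmatrix} \] 
(verify). This matrix is invertible, and from Formula 7 of Section 4.10 the standard matrix for 
$T^{-1}$ is 
\[ [T^{-1}] = [T]^{-1} = \begin{bmatrix} 4 & -2 & -3 \\ -11 & 6 & 9 \\ -12 & 7 & 10 \end{bmatrix} \] 


It follows that 


\[ T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = [T^{-1}] \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4 & -2 & -3 \\ -11 & 6 & 9 \\ -12 & 7 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4x_1 - 2x_2 - 3x_3 \\ -11x_1 + 6x_2 + 9x_3 \\ -12x_1 + 7x_2 + 10x_3 \end{bmatrix} \] 


Expressing this result in horizontal notation yields 


\[ T^{-1}(x_1, x_2, x_3) = (4x_1 - 2x_2 - 3x_3,\; -11x_1 + 6x_2 + 9x_3,\; -12x_1 + 7x_2 + 10x_3) \] 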
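

The "(verify)" step in Example 4 can be carried out mechanically. The following sketch multiplies the two standard matrices and round-trips a sample vector; the helpers `matmul` and `apply` are illustrative names, and the sample vector is an arbitrary choice.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def apply(M, v):
    """Multiply the matrix M by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

T_std = [[3, 1, 0], [-2, -4, 3], [5, 4, -2]]          # standard matrix for T
T_inv_std = [[4, -2, -3], [-11, 6, 9], [-12, 7, 10]]  # claimed inverse

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(T_std, T_inv_std) == identity)   # True
print(matmul(T_inv_std, T_std) == identity)   # True

w = apply(T_std, [1, 2, 3])                   # image of (1, 2, 3) under T
print(w, apply(T_inv_std, w))                 # [5, -1, 7] [1, 2, 3]
```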


Composition of One-To-One Linear Transformations 


The following theorem shows that a composition of one-to-one linear transformations is one-to-one, and it 
relates the inverse of a composition to the inverses of its individual linear transformations. 


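THEOREM 8.3.2 


If $T_1: U \to V$ and $T_2: V \to W$ are one-to-one linear transformations, then 
(a) $T_2 \circ T_1$ is one-to-one. 
(b) $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$. 


Proof (a) We want to show that $T_2 \circ T_1$ maps distinct vectors in $U$ into distinct vectors in $W$. But if $\mathbf{u}$ and $\mathbf{v}$ 
are distinct vectors in $U$, then $T_1(\mathbf{u})$ and $T_1(\mathbf{v})$ are distinct vectors in $V$ since $T_1$ is one-to-one. This and the 
fact that $T_2$ is one-to-one imply that 


\[ T_2(T_1(\mathbf{u})) \quad \text{and} \quad T_2(T_1(\mathbf{v})) \] 
are also distinct vectors. But these expressions can also be written as 
\[ (T_2 \circ T_1)(\mathbf{u}) \quad \text{and} \quad (T_2 \circ T_1)(\mathbf{v}) \] 


so $T_2 \circ T_1$ maps $\mathbf{u}$ and $\mathbf{v}$ into distinct vectors in $W$. 


Proof (b) We want to show that 


\[ (T_2 \circ T_1)^{-1}(\mathbf{w}) = (T_1^{-1} \circ T_2^{-1})(\mathbf{w}) \] 
for every vector $\mathbf{w}$ in the range of $T_2 \circ T_1$. For this purpose, let 
\[ \mathbf{u} = (T_2 \circ T_1)^{-1}(\mathbf{w}) \tag{6} \] 


so our goal is to show that 


\[ \mathbf{u} = (T_1^{-1} \circ T_2^{-1})(\mathbf{w}) \] 


But it follows from 6 that 
\[ (T_2 \circ T_1)(\mathbf{u}) = \mathbf{w} \] 
or, equivalently, 
\[ T_2(T_1(\mathbf{u})) = \mathbf{w} \] 
Now, taking $T_2^{-1}$ of each side of this equation, then taking $T_1^{-1}$ of each side of the result, and then using 4 
yields (verify) 
\[ \mathbf{u} = T_1^{-1}(T_2^{-1}(\mathbf{w})) \] 
or, equivalently, 


\[ \mathbf{u} = (T_1^{-1} \circ T_2^{-1})(\mathbf{w}) \] 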


In words, part (b) of Theorem 8.3.2 states that the inverse of a composition is the composition of the inverses 
in the reverse order. This result can be extended to compositions of three or more linear transformations; for 
example, 


\[ (T_3 \circ T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1} \circ T_3^{-1} \tag{7} \] 


In the case where $T_A$, $T_B$, and $T_C$ are matrix operators on $R^n$, Formula 7 can be written as 
\[ (T_C \circ T_B \circ T_A)^{-1} = T_A^{-1} \circ T_B^{-1} \circ T_C^{-1} \] 


or alternatively as 


\[ (T_{CBA})^{-1} = T_{A^{-1}B^{-1}C^{-1}} \tag{8} \] 


Note the order of the subscripts on the two 
sides of Formula 8. 
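

The reversal of order in Formula 8 can be illustrated with small matrices. The sketch below uses exact rational arithmetic on three invertible $2 \times 2$ matrices chosen arbitrarily for illustration; the helper names are assumptions and not part of the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 5]]
B = [[2, 1], [1, 1]]
C = [[1, 1], [0, 1]]

lhs = inverse(matmul(C, matmul(B, A)))                        # (CBA)^{-1}
rhs = matmul(inverse(A), matmul(inverse(B), inverse(C)))      # A^{-1} B^{-1} C^{-1}
same_order = matmul(inverse(C), matmul(inverse(B), inverse(A)))

print(lhs == rhs)          # True: inverses taken in reverse order
print(lhs == same_order)   # False for these matrices
```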


Concept Review 
* Composition of linear transformations 


* Inverse of a linear transformation 


Skills 

* Find the domain and range of the composition of two linear transformations. 
* Find the composition of two linear transformations. 

* Determine whether a linear transformation has an inverse. 


* Find the inverse of a linear transformation. 


Exercise Set 8.3 


1. Find $(T_2 \circ T_1)(x, y)$. 
(a) $T_1(x, y) = (2x, 3y)$, $T_2(x, y) = (x - y, x + y)$ 
(b) $T_1(x, y) = (x - 3y, 0)$, $T_2(x, y) = (4x - 5y, 3x - 6y)$ 
(c) $T_1(x, y) = (2x, -3y, x + y)$, $T_2(x, y, z) = (x - y, y + z)$ 
(d) $T_1(x, y) = (x - y, y, x)$, $T_2(x, y, z) = (0, x + y + z)$ 

Answer: 


(a) $(T_2 \circ T_1)(x, y) = (2x - 3y, 2x + 3y)$ 
(b) $(T_2 \circ T_1)(x, y) = (4x - 12y, 3x - 9y)$ 
(c) $(T_2 \circ T_1)(x, y) = (2x + 3y, x - 2y)$ 
(d) $(T_2 \circ T_1)(x, y) = (0, 2x)$ 


2. Find $(T_3 \circ T_2 \circ T_1)(x, y)$. 
(a) $T_1(x, y) = (-2y, 3x, x - 2y)$, $T_2(x, y, z) = (y, z, x)$, $T_3(x, y, z) = (x + z, y - z)$ 
(b) $T_1(x, y) = (x + y, y, -x)$, $T_2(x, y, z) = (0, x + y + z, 3y)$, 
$T_3(x, y, z) = (3x + 2y, 4z - x - 3y)$ 


3. Let $T_1: M_{22} \to R$ and $T_2: M_{22} \to M_{22}$ be the linear transformations given by $T_1(A) = \operatorname{tr}(A)$ and 
$T_2(A) = A^T$. 
(a) Find $(T_1 \circ T_2)(A)$, where $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. 
(b) Can you find $(T_2 \circ T_1)(A)$? Explain. 


Answer: 


(a) $a + d$ 
(b) $(T_2 \circ T_1)(A)$ does not exist since $T_1(A)$ is not a $2 \times 2$ matrix. 


4. Let $T_1: P_n \to P_n$ and $T_2: P_n \to P_n$ be the linear operators given by $T_1(p(x)) = p(x - 1)$ and 
$T_2(p(x)) = p(x + 1)$. Find $(T_1 \circ T_2)(p(x))$ and $(T_2 \circ T_1)(p(x))$. 

5. Let $T_1: V \to V$ be the dilation $T_1(\mathbf{v}) = 4\mathbf{v}$. Find a linear operator $T_2: V \to V$ such that $T_1 \circ T_2 = I$ and 
$T_2 \circ T_1 = I$. 


Answer: 


$T_2(\mathbf{v}) = \tfrac{1}{4}\mathbf{v}$ 


6. Suppose that the linear transformations $T_1: P_2 \to P_2$ and $T_2: P_2 \to P_3$ are given by the formulas 
$T_1(p(x)) = p(x + 1)$ and $T_2(p(x)) = xp(x)$. Find $(T_2 \circ T_1)(a_0 + a_1 x + a_2 x^2)$. 
7. Let $q_0(x)$ be a fixed polynomial of degree $m$, and define a function $T$ with domain $P_n$ by the formula 
$T(p(x)) = p(q_0(x))$. Show that $T$ is a linear transformation. 
8. Use the definition of $T_3 \circ T_2 \circ T_1$ given by Formula 3 to prove that 
(a) $T_3 \circ T_2 \circ T_1$ is a linear transformation. 
(b) $T_3 \circ T_2 \circ T_1 = (T_3 \circ T_2) \circ T_1$. 
(c) $T_3 \circ T_2 \circ T_1 = T_3 \circ (T_2 \circ T_1)$. 
9. Let $T: R^3 \to R^3$ be the orthogonal projection of $R^3$ onto the $xy$-plane. Show that $T \circ T = T$. 


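10. In each part, let $T: R^2 \to R^2$ be multiplication by $A$. Determine whether $T$ has an inverse; if so, find 
$T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right)$. 
(a) $A = \begin{bmatrix} 5 & 2 \\ 2 & 1 \end{bmatrix}$ 
(b) $A = \begin{bmatrix} 6 & -3 \\ 4 & -2 \end{bmatrix}$ 
(c) $A = \begin{bmatrix} 4 & 7 \\ -1 & 3 \end{bmatrix}$ 
11. In each part, let $T: R^3 \to R^3$ be multiplication by $A$. Determine whether $T$ has an inverse; if so, find 
$T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right)$. 
(a) $A = \begin{bmatrix} 1 & 5 & 2 \\ 1 & 2 & 1 \\ -1 & 1 & 0 \end{bmatrix}$ 
(b) $A = \begin{bmatrix} 1 & 4 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 0 \end{bmatrix}$ 
(c) $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ 
(d) $A = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 2 & -1 \\ 2 & 3 & 0 \end{bmatrix}$ 


Answer: 


(a) $T$ has no inverse. 
(b) $T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} \tfrac{1}{8}x_1 + \tfrac{1}{8}x_2 - \tfrac{3}{4}x_3 \\ \tfrac{1}{8}x_1 + \tfrac{1}{8}x_2 + \tfrac{1}{4}x_3 \\ -\tfrac{3}{8}x_1 + \tfrac{5}{8}x_2 + \tfrac{1}{4}x_3 \end{bmatrix}$ 
(c) $T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} \tfrac{1}{2}x_1 - \tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 \\ -\tfrac{1}{2}x_1 + \tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 \\ \tfrac{1}{2}x_1 + \tfrac{1}{2}x_2 - \tfrac{1}{2}x_3 \end{bmatrix}$ 
(d) $T^{-1}\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} 3x_1 + 3x_2 - x_3 \\ -2x_1 - 2x_2 + x_3 \\ -4x_1 - 5x_2 + 2x_3 \end{bmatrix}$ 


12. In each part, determine whether the linear operator $T: R^n \to R^n$ is one-to-one; if so, find 
$T^{-1}(x_1, x_2, \ldots, x_n)$. 
(a) $T(x_1, x_2, \ldots, x_n) = (0, x_1, x_2, \ldots, x_{n-1})$ 
(b) $T(x_1, x_2, \ldots, x_n) = (x_n, x_{n-1}, \ldots, x_2, x_1)$ 
(c) $T(x_1, x_2, \ldots, x_n) = (x_2, x_3, \ldots, x_n, x_1)$ 
13. Let $T: R^n \to R^n$ be the linear operator defined by the formula 
\[ T(x_1, x_2, \ldots, x_n) = (a_1 x_1, a_2 x_2, \ldots, a_n x_n) \] 
where $a_1, \ldots, a_n$ are constants. 
(a) Under what conditions will $T$ have an inverse? 


(b) Assuming that the conditions determined in part (a) are satisfied, find a formula for 
$T^{-1}(x_1, x_2, \ldots, x_n)$. 


Answer: 


(a) $a_i \neq 0$ for $i = 1, 2, 3, \ldots, n$ 
(b) $T^{-1}(x_1, x_2, x_3, \ldots, x_n) = \left( \tfrac{1}{a_1}x_1, \tfrac{1}{a_2}x_2, \tfrac{1}{a_3}x_3, \ldots, \tfrac{1}{a_n}x_n \right)$ 


14. Let $T_1: R^2 \to R^2$ and $T_2: R^2 \to R^2$ be the linear operators given by the formulas 
\[ T_1(x, y) = (x + y, x - y) \quad \text{and} \quad T_2(x, y) = (2x + y, x - 2y) \] 


(a) Show that $T_1$ and $T_2$ are one-to-one. 
(b) Find formulas for 


\[ T_1^{-1}(x, y), \quad T_2^{-1}(x, y), \quad (T_2 \circ T_1)^{-1}(x, y) \] 


(c) Verify that $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$. 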


15. Let $T_1: P_2 \to P_3$ and $T_2: P_3 \to P_3$ be the linear transformations given by the formulas 
\[ T_1(p(x)) = xp(x) \quad \text{and} \quad T_2(p(x)) = p(x + 1) \] 
(a) Find formulas for $T_1^{-1}(p(x))$, $T_2^{-1}(p(x))$, and $(T_2 \circ T_1)^{-1}(p(x))$. 
(b) Verify that $(T_2 \circ T_1)^{-1} = T_1^{-1} \circ T_2^{-1}$. 


Answer: 


(a) $T_1^{-1}(p(x)) = \dfrac{p(x)}{x}$; $\; T_2^{-1}(p(x)) = p(x - 1)$; $\; (T_2 \circ T_1)^{-1}(p(x)) = \dfrac{1}{x}\,p(x - 1)$ 
16. Let $T_A: R^3 \to R^3$, $T_B: R^3 \to R^3$, and $T_C: R^3 \to R^3$ be the reflections about the $xy$-plane, the $xz$-plane, and 
the $yz$-plane, respectively. Verify Formula 8 for these linear operators. 
17. Let $T: P_1 \to R^2$ be the function defined by the formula 
\[ T(p(x)) = (p(0), p(1)) \] 
(a) Find $T(1 - 2x)$. 
(b) Show that $T$ is a linear transformation. 


(c) Show that $T$ is one-to-one. 


(d) Find $T^{-1}(2, 3)$ and sketch its graph. 


Answer: 


(a) $(1, -1)$ 
(d) $T^{-1}(2, 3) = 2 + x$ 


18. Let $T: R^2 \to R^2$ be the linear operator given by the formula $T(x, y) = (x + ky, -y)$. Show that $T$ is 
one-to-one and that $T^{-1} = T$ for every real value of $k$. 
19. Prove: If $T: V \to W$ is a one-to-one linear transformation, then $T^{-1}: R(T) \to V$ is a one-to-one linear 


transformation. 


In Exercises 20-21, determine whether $T_1 \circ T_2 = T_2 \circ T_1$. 


20. (a) $T_1: R^2 \to R^2$ is the orthogonal projection on the $x$-axis, and $T_2: R^2 \to R^2$ is the orthogonal projection 
on the $y$-axis. 
(b) $T_1: R^2 \to R^2$ is the rotation about the origin through an angle $\theta_1$, and $T_2: R^2 \to R^2$ is the rotation 
about the origin through an angle $\theta_2$. 
(c) $T_1: R^3 \to R^3$ is the rotation about the $x$-axis through an angle $\theta_1$, and $T_2: R^3 \to R^3$ is the rotation 
about the $z$-axis through an angle $\theta_2$. 


21. (a) $T_1: R^2 \to R^2$ is the reflection about the $x$-axis, and $T_2: R^2 \to R^2$ is the reflection about the $y$-axis. 


(b) $T_1: R^2 \to R^2$ is the orthogonal projection on the $x$-axis, and $T_2: R^2 \to R^2$ is the counterclockwise 
rotation through an angle $\theta$. 


(c) $T_1: R^3 \to R^3$ is a dilation by a factor $k$, and $T_2: R^3 \to R^3$ is the counterclockwise rotation about the 
$z$-axis through an angle $\theta$. 


Answer: 


(a) $T_1 \circ T_2 = T_2 \circ T_1$ 

(b) $T_1 \circ T_2 \neq T_2 \circ T_1$ 

(c) $T_1 \circ T_2 = T_2 \circ T_1$ 

22. (Calculus required) Let 

\[ D(f) = f'(x) \quad \text{and} \quad J(f) = \int_0^x f(t)\,dt \] 

be the linear transformations in Examples 11 and 12 of Section 8.1. Find $(J \circ D)(f)$ for 

(a) $f(x) = x^2 + 3x + 2$ 

(b) $f(x) = \sin x$ 

(c) $f(x) = e^x + 3$ 


23. (Calculus required) The Fundamental Theorem of Calculus implies that integration and differentiation 
reverse the actions of each other. Define a transformation $D: P_n \to P_{n-1}$ by $D(p(x)) = p'(x)$, and 
define $J: P_{n-1} \to P_n$ by 
\[ J(p(x)) = \int_0^x p(t)\,dt \] 

(a) Show that D and J are linear transformations. 


(b) Explain why J is not the inverse transformation of D. 


(c) Can the domains and/or codomains of D and J be restricted so they are inverse linear transformations? 


True-False Exercises 


In parts (a)-(f) determine whether the statement is true or false, and justify your answer. 


(a) The composition of two linear transformations is also a linear transformation. 


Answer: 


True 


(b) If $T_1: V \to V$ and $T_2: V \to V$ are any two linear operators, then $T_1 \circ T_2 = T_2 \circ T_1$. 


Answer: 


False 


(c) The inverse of a linear transformation is a linear transformation. 


Answer: 


False 


(d) If a linear transformation Т has an inverse, then the kernel of T is the zero subspace. 
Answer: 


True 
(e) If $T: R^2 \to R^2$ is the orthogonal projection onto the $x$-axis, then $T^{-1}: R^2 \to R^2$ maps each point on the 
$x$-axis onto a line that is perpendicular to the $x$-axis. 
Answer: 


False 


(f) If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, and if $T_1$ is not one-to-one, then neither is 
$T_2 \circ T_1$. 


Answer: 


True 
8.4 Matrices for General Linear Transformations 


In this section we will show that a general linear transformation from any $n$-dimensional vector space $V$ to any 
$m$-dimensional vector space $W$ can be performed using an appropriate matrix transformation from $R^n$ to $R^m$. This idea is 
used in computer computations since computers are well suited for performing matrix computations. 


Matrices of Linear Transformations 


Suppose that $V$ is an $n$-dimensional vector space, $W$ is an $m$-dimensional vector space, and that $T: V \to W$ is a linear 
transformation. Suppose further that $B$ is a basis for $V$, that $B'$ is a basis for $W$, and that for each vector $\mathbf{x}$ in $V$, the 
coordinate matrices for $\mathbf{x}$ and $T(\mathbf{x})$ are $[\mathbf{x}]_B$ and $[T(\mathbf{x})]_{B'}$, respectively (Figure 8.4.1). 


Figure 8.4.1 $T$ maps a vector $\mathbf{x}$ in $V$ ($n$-dimensional) to a vector $T(\mathbf{x})$ in $W$ ($m$-dimensional); the coordinate vectors $[\mathbf{x}]_B$ and $[T(\mathbf{x})]_{B'}$ are vectors in $R^n$ and $R^m$, respectively. 


It will be our goal to find an $m \times n$ matrix $A$ such that multiplication by $A$ maps the vector $[\mathbf{x}]_B$ into the vector $[T(\mathbf{x})]_{B'}$ 
for each $\mathbf{x}$ in $V$ (Figure 8.4.2a). If we can do so, then, as illustrated in Figure 8.4.2b, we will be able to execute the linear 
transformation $T$ by using matrix multiplication and the following indirect procedure: 


Finding T (x) Indirectly 


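Step 1. Compute the coordinate vector $[\mathbf{x}]_B$. 
Step 2. Multiply $[\mathbf{x}]_B$ on the left by $A$ to produce $[T(\mathbf{x})]_{B'}$. 
Step 3. Reconstruct $T(\mathbf{x})$ from its coordinate vector $[T(\mathbf{x})]_{B'}$. 


Figure 8.4.2 (a) $T$ maps $V$ into $W$, while multiplication by $A$ maps $R^n$ into $R^m$. (b) Direct computation of $T(\mathbf{x})$ from $\mathbf{x}$ compared with the indirect route: (1) compute $[\mathbf{x}]_B$, (2) multiply by $A$ to obtain $[T(\mathbf{x})]_{B'}$, (3) reconstruct $T(\mathbf{x})$. 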


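The key to executing this plan is to find an $m \times n$ matrix $A$ with the property that 
\[ A[\mathbf{x}]_B = [T(\mathbf{x})]_{B'} \tag{1} \] 


For this purpose, let $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be a basis for the $n$-dimensional space $V$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ a basis for 
the $m$-dimensional space $W$. Since Equation 1 must hold for all vectors in $V$, it must hold, in particular, for the basis 
vectors in $B$; that is, 


\[ A[\mathbf{u}_1]_B = [T(\mathbf{u}_1)]_{B'}, \quad A[\mathbf{u}_2]_B = [T(\mathbf{u}_2)]_{B'}, \quad \ldots, \quad A[\mathbf{u}_n]_B = [T(\mathbf{u}_n)]_{B'} \tag{2} \] 
But 
\[ [\mathbf{u}_1]_B = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad [\mathbf{u}_2]_B = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad [\mathbf{u}_n]_B = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \] 
so 
\[ A[\mathbf{u}_1]_B = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} \] 
\[ A[\mathbf{u}_2]_B = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} \] 
\[ \vdots \] 
\[ A[\mathbf{u}_n]_B = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} \] 
Substituting these results into 2 yields 
\[ \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} = [T(\mathbf{u}_1)]_{B'}, \quad \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} = [T(\mathbf{u}_2)]_{B'}, \quad \ldots, \quad \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = [T(\mathbf{u}_n)]_{B'} \] 


which shows that the successive columns of $A$ are the coordinate vectors of 
\[ T(\mathbf{u}_1), \; T(\mathbf{u}_2), \; \ldots, \; T(\mathbf{u}_n) \] 
with respect to the basis $B'$. Thus, the matrix $A$ that completes the link in Figure 8.4.2a is 


\[ A = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \mid \cdots \mid [T(\mathbf{u}_n)]_{B'} \,\big] \tag{3} \] 


We will call this the matrix for $T$ relative to the bases $B$ and $B'$ and will denote it by the symbol $[T]_{B',B}$. Using this 
notation, Formula 3 can be written as 


\[ [T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \mid \cdots \mid [T(\mathbf{u}_n)]_{B'} \,\big] \tag{4} \] 
and from 1, this matrix has the property 
\[ [T]_{B',B}[\mathbf{x}]_B = [T(\mathbf{x})]_{B'} \tag{5} \] 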
[7] g' glxi p = IT (x)] g' (5) 


We leave it as an exercise to show that in the special case where $T_A: R^n \to R^m$ is multiplication by $A$, and where $B$ and $B'$ 
are the standard bases for $R^n$ and $R^m$, respectively, then 


\[ [T_A]_{B',B} = A \tag{6} \] 


Remark Observe that in the notation $[T]_{B',B}$ the right subscript is a basis for the domain of $T$, and the left subscript is 
a basis for the image space of $T$ (Figure 8.4.3). Moreover, observe how the subscript $B$ seems to “cancel out” in Formula 
5 (Figure 8.4.4). 


Figure 8.4.3 In $[T]_{B',B}$, the left subscript $B'$ is the basis for the image space and the right subscript $B$ is the basis for the domain. 


Figure 8.4.4 Cancellation of the subscript $B$ in $[T]_{B',B}[\mathbf{x}]_B = [T(\mathbf{x})]_{B'}$. 


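EXAMPLE 1 Matrix for a Linear Transformation 


Let $T: P_1 \to P_2$ be the linear transformation defined by 
\[ T(p(x)) = xp(x) \] 
Find the matrix for $T$ with respect to the standard bases 
\[ B = \{\mathbf{u}_1, \mathbf{u}_2\} \quad \text{and} \quad B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} \] 
where 


\[ \mathbf{u}_1 = 1, \quad \mathbf{u}_2 = x; \qquad \mathbf{v}_1 = 1, \quad \mathbf{v}_2 = x, \quad \mathbf{v}_3 = x^2 \] 


Solution From the given formula for $T$ we obtain 
\[ T(\mathbf{u}_1) = T(1) = (x)(1) = x \] 
\[ T(\mathbf{u}_2) = T(x) = (x)(x) = x^2 \] 


By inspection, the coordinate vectors for $T(\mathbf{u}_1)$ and $T(\mathbf{u}_2)$ relative to $B'$ are 


\[ [T(\mathbf{u}_1)]_{B'} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad [T(\mathbf{u}_2)]_{B'} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \] 
Thus, the matrix for $T$ with respect to $B$ and $B'$ is 
\[ [T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \,\big] = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \] 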
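

The column-by-column construction of Formula 4 can be mirrored in code for Example 1. This is a minimal sketch under the assumption that polynomials are stored as coefficient lists `[c0, c1, ...]`, so coordinates relative to the standard bases are just those lists; `matrix_of` is a hypothetical helper name.

```python
def T(p):
    """T(p(x)) = x * p(x) on coefficient lists [c0, c1, ...]."""
    return [0] + list(p)

def matrix_of(transform, domain_basis, codomain_dim):
    """Build the matrix whose columns are the coordinates of transform(u_j)."""
    columns = []
    for u in domain_basis:
        image = transform(u)
        image = image + [0] * (codomain_dim - len(image))  # pad with zeros
        columns.append(image)
    # Transpose the list of columns into a list of rows.
    return [[col[i] for col in columns] for i in range(codomain_dim)]

B = [[1, 0], [0, 1]]        # u1 = 1, u2 = x  (standard basis for P_1)
A = matrix_of(T, B, 3)      # P_2 has dimension 3
print(A)                    # [[0, 0], [1, 0], [0, 1]]
```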


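EXAMPLE 2 The Three-Step Procedure 


Let $T: P_1 \to P_2$ be the linear transformation in Example 1, and use the three-step procedure described in 
the following figure to perform the computation 


\[ T(a + bx) = x(a + bx) = ax + bx^2 \] 


[Figure: direct computation $\mathbf{x} \to T(\mathbf{x})$ compared with the indirect route (1) $\mathbf{x} \to [\mathbf{x}]_B$, (2) multiply by $[T]_{B',B}$ to obtain $[T(\mathbf{x})]_{B'}$, (3) $[T(\mathbf{x})]_{B'} \to T(\mathbf{x})$.] 


Solution 


Step 1. The coordinate matrix for $\mathbf{x} = a + bx$ relative to the basis $B = \{1, x\}$ is 
\[ [\mathbf{x}]_B = \begin{bmatrix} a \\ b \end{bmatrix} \] 


Step 2. Multiplying $[\mathbf{x}]_B$ by the matrix $[T]_{B',B}$ found in Example 1 we obtain 


\[ [T]_{B',B}[\mathbf{x}]_B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ a \\ b \end{bmatrix} = [T(\mathbf{x})]_{B'} \] 


Step 3. Reconstructing $T(\mathbf{x}) = T(a + bx)$ from $[T(\mathbf{x})]_{B'}$ we obtain 
\[ T(a + bx) = 0 + ax + bx^2 = ax + bx^2 \] 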

Although Example 2 is simple, the procedure that it 
illustrates is applicable to problems of great 
complexity. 


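EXAMPLE 3 Matrix for a Linear Transformation 


Let $T: R^2 \to R^3$ be the linear transformation defined by 
\[ T\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right) = \begin{bmatrix} x_2 \\ -5x_1 + 13x_2 \\ -7x_1 + 16x_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -5 & 13 \\ -7 & 16 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \] 


Find the matrix for the transformation $T$ with respect to the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ for $R^2$ and 
$B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ for $R^3$, where 


\[ \mathbf{u}_1 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 5 \\ 2 \end{bmatrix}; \qquad \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} \] 
Solution From the formula for $T$, 
\[ T(\mathbf{u}_1) = \begin{bmatrix} 1 \\ -2 \\ -5 \end{bmatrix}, \quad T(\mathbf{u}_2) = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} \] 


Expressing these vectors as linear combinations of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, we obtain (verify) 
\[ T(\mathbf{u}_1) = \mathbf{v}_1 - 2\mathbf{v}_3, \quad T(\mathbf{u}_2) = 3\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_3 \] 


Thus, 
\[ [T(\mathbf{u}_1)]_{B'} = \begin{bmatrix} 1 \\ 0 \\ -2 \end{bmatrix}, \quad [T(\mathbf{u}_2)]_{B'} = \begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} \] 
so 
\[ [T]_{B',B} = \big[\, [T(\mathbf{u}_1)]_{B'} \mid [T(\mathbf{u}_2)]_{B'} \,\big] = \begin{bmatrix} 1 & 3 \\ 0 & 1 \\ -2 & -1 \end{bmatrix} \] 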
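

The "(verify)" step in Example 3 requires solving a linear system for each image vector, because $B'$ is not the standard basis. The following sketch does this with exact rational arithmetic; the Gauss-Jordan helper `solve` is an illustrative implementation that assumes the coefficient matrix is invertible.

```python
from fractions import Fraction

def solve(M, b):
    """Solve the square system M x = b by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(M, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

T_std = [[0, 1], [-5, 13], [-7, 16]]            # standard matrix for T
u1, u2 = [3, 1], [5, 2]                         # basis B for R^2
V = [[1, -1, 0], [0, 2, 1], [-1, 2, 2]]         # columns are v1, v2, v3

col1 = solve(V, apply(T_std, u1))               # coordinates of T(u1) relative to B'
col2 = solve(V, apply(T_std, u2))               # coordinates of T(u2) relative to B'
matrix = [[col1[i], col2[i]] for i in range(3)]
print(matrix == [[1, 3], [0, 1], [-2, -1]])     # True
```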


Remark Example 3 illustrates that a fixed linear transformation generally has multiple representations, each depending 
on the bases chosen. In this case the matrices 


\[ [T] = \begin{bmatrix} 0 & 1 \\ -5 & 13 \\ -7 & 16 \end{bmatrix} \quad \text{and} \quad [T]_{B',B} = \begin{bmatrix} 1 & 3 \\ 0 & 1 \\ -2 & -1 \end{bmatrix} \] 


both represent the transformation $T$, the first relative to the standard bases for $R^2$ and $R^3$, the second relative to the bases 
$B$ and $B'$ stated in the example. 


Matrices of Linear Operators 


In the special case where $V = W$ (so that $T: V \to V$ is a linear operator), it is usual to take $B = B'$ when constructing a 
matrix for $T$. In this case the resulting matrix is called the matrix for $T$ relative to the basis $B$ and is usually denoted by 
$[T]_B$ rather than $[T]_{B,B}$. If $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$, then Formulas 4 and 5 become 


\[ [T]_B = \big[\, [T(\mathbf{u}_1)]_B \mid [T(\mathbf{u}_2)]_B \mid \cdots \mid [T(\mathbf{u}_n)]_B \,\big] \tag{7} \] 


\[ [T]_B[\mathbf{x}]_B = [T(\mathbf{x})]_B \tag{8} \] 


Phrased informally, Formulas 7 and 8 state that the 
matrix for $T$, when multiplied by the coordinate 
vector for $\mathbf{x}$, produces the coordinate vector for $T(\mathbf{x})$. 


In the special case where $T: R^n \to R^n$ is a matrix operator, say multiplication by $A$, and $B$ is the standard basis for $R^n$, 
then Formula 7 simplifies to 


\[ [T]_B = A \tag{9} \] 


Matrices of Identity Operators 


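Recall that the identity operator $I: V \to V$ maps every vector in $V$ into itself, that is, $I(\mathbf{x}) = \mathbf{x}$ for every vector $\mathbf{x}$ in $V$. The 
following example shows that if $V$ is $n$-dimensional, then the matrix for $I$ relative to any basis $B$ for $V$ is the $n \times n$ identity 
matrix. 


EXAMPLE 4 Matrices of Identity Operators 


If $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ is a basis for a finite-dimensional vector space $V$, and if $I: V \to V$ is the identity 
operator on $V$, then 


\[ I(\mathbf{u}_1) = \mathbf{u}_1, \quad I(\mathbf{u}_2) = \mathbf{u}_2, \quad \ldots, \quad I(\mathbf{u}_n) = \mathbf{u}_n \] 
Therefore, 
\[ [I]_B = \big[\, [I(\mathbf{u}_1)]_B \mid [I(\mathbf{u}_2)]_B \mid \cdots \mid [I(\mathbf{u}_n)]_B \,\big] = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I \] 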


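EXAMPLE 5 Linear Operator on $P_2$ 


Let $T: P_2 \to P_2$ be the linear operator defined by 
\[ T(p(x)) = p(3x - 5) \] 
that is, $T(c_0 + c_1 x + c_2 x^2) = c_0 + c_1(3x - 5) + c_2(3x - 5)^2$. 


(a) Find $[T]_B$ relative to the basis $B = \{1, x, x^2\}$. 
(b) Use the indirect procedure to compute $T(1 + 2x + 3x^2)$. 
(c) Check the result in (b) by computing $T(1 + 2x + 3x^2)$ directly. 


Solution 


(a) From the formula for $T$, 
\[ T(1) = 1, \quad T(x) = 3x - 5, \quad T(x^2) = (3x - 5)^2 = 9x^2 - 30x + 25 \] 
so 
\[ [T(1)]_B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad [T(x)]_B = \begin{bmatrix} -5 \\ 3 \\ 0 \end{bmatrix}, \quad [T(x^2)]_B = \begin{bmatrix} 25 \\ -30 \\ 9 \end{bmatrix} \] 
Thus, 
\[ [T]_B = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \] 
(b) Step 1. The coordinate matrix for $\mathbf{p} = 1 + 2x + 3x^2$ relative to the basis $B = \{1, x, x^2\}$ is 
\[ [\mathbf{p}]_B = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \] 
Step 2. Multiplying $[\mathbf{p}]_B$ by the matrix $[T]_B$ found in part (a) we obtain 
\[ [T]_B[\mathbf{p}]_B = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 66 \\ -84 \\ 27 \end{bmatrix} = [T(\mathbf{p})]_B \] 


Step 3. Reconstructing $T(\mathbf{p}) = T(1 + 2x + 3x^2)$ from $[T(\mathbf{p})]_B$ we obtain 
\[ T(1 + 2x + 3x^2) = 66 - 84x + 27x^2 \] 
(c) By direct computation, 
\[ \begin{aligned} T(1 + 2x + 3x^2) &= 1 + 2(3x - 5) + 3(3x - 5)^2 \\ &= 1 + 6x - 10 + 27x^2 - 90x + 75 \\ &= 66 - 84x + 27x^2 \end{aligned} \] 
which agrees with the result in (b). 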
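

Parts (a) through (c) of Example 5 can be reproduced programmatically. The sketch below assumes polynomials in $P_2$ are stored as coefficient lists `[c0, c1, c2]`; the helper names are illustrative and not part of the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def T(p):
    """T(p(x)) = p(3x - 5) for p in P_2, returned as [c0, c1, c2]."""
    result = [0, 0, 0]
    power = [1]                          # (3x - 5)^0
    for c in p:
        for i, a in enumerate(power):
            result[i] += c * a
        power = poly_mul(power, [-5, 3]) # next power of (3x - 5)
    return result

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]            # 1, x, x^2
columns = [T(u) for u in basis]                      # images of the basis vectors
T_B = [[col[i] for col in columns] for i in range(3)]

print(T_B)                        # [[1, -5, 25], [0, 3, -30], [0, 0, 9]]
print(mat_vec(T_B, [1, 2, 3]))    # indirect: [66, -84, 27]
print(T([1, 2, 3]))               # direct:   [66, -84, 27]
```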


Matrices of Compositions and Inverse Transformations 


We will conclude this section by mentioning two theorems without proof that are generalizations of Formulas 4 and 7 of 
Section 4.10. 


THEOREM 8.4.1 


If $T_1: U \to V$ and $T_2: V \to W$ are linear transformations, and if $B$, $B''$, and $B'$ are bases for $U$, $V$, and $W$, 
respectively, then 


\[ [T_2 \circ T_1]_{B',B} = [T_2]_{B',B''}[T_1]_{B'',B} \tag{10} \] 


THEOREM 8.4.2 


If $T: V \to V$ is a linear operator, and if $B$ is a basis for $V$, then the following are equivalent. 
(a) $T$ is one-to-one. 
(b) $[T]_B$ is invertible. 


Moreover, when these equivalent conditions hold, 


\[ [T^{-1}]_B = [T]_B^{-1} \tag{11} \] 


Remark In 10, observe how the interior subscript $B''$ (the basis for the intermediate space $V$) seems to “cancel out,” 
leaving only the bases for the domain and image space of the composition as subscripts (Figure 8.4.5). This cancellation 
of interior subscripts suggests the following extension of Formula 10 to compositions of three linear transformations 
(Figure 8.4.6): 


\[ [T_3 \circ T_2 \circ T_1]_{B',B} = [T_3]_{B',B'''}[T_2]_{B''',B''}[T_1]_{B'',B} \tag{12} \] 


Figure 8.4.5 Cancellation of the interior subscript $B''$ in $[T_2 \circ T_1]_{B',B} = [T_2]_{B',B''}[T_1]_{B'',B}$. 


Figure 8.4.6 The transformations $T_1$, $T_2$, $T_3$ acting in succession between spaces with bases $B$, $B''$, $B'''$, and $B'$. 


The following example illustrates Theorem 8.4.1. 


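EXAMPLE 6 Composition 


Let $T_1: P_1 \to P_2$ be the linear transformation defined by 
\[ T_1(p(x)) = xp(x) \] 
and let $T_2: P_2 \to P_2$ be the linear operator defined by 
\[ T_2(p(x)) = p(3x - 5) \] 
Then the composition $(T_2 \circ T_1): P_1 \to P_2$ is given by 


\[ (T_2 \circ T_1)(p(x)) = T_2(T_1(p(x))) = T_2(xp(x)) = (3x - 5)\,p(3x - 5) \] 
Thus, if $p(x) = c_0 + c_1 x$, then 
\[ \begin{aligned} (T_2 \circ T_1)(c_0 + c_1 x) &= (3x - 5)(c_0 + c_1(3x - 5)) \\ &= c_0(3x - 5) + c_1(3x - 5)^2 \end{aligned} \tag{13} \] 


In this example, $P_1$ plays the role of $U$ in Theorem 8.4.1, and $P_2$ plays the roles of both $V$ and $W$; thus we can 
take $B' = B''$ in 10 so that the formula simplifies to 


\[ [T_2 \circ T_1]_{B',B} = [T_2]_{B'}[T_1]_{B',B} \tag{14} \] 


Let us choose $B = \{1, x\}$ to be the basis for $P_1$ and choose $B' = \{1, x, x^2\}$ to be the basis for $P_2$. We 
showed in Examples 1 and 5 that 


\[ [T_1]_{B',B} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad [T_2]_{B'} = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \] 
Thus, it follows from 14 that 
\[ [T_2 \circ T_1]_{B',B} = \begin{bmatrix} 1 & -5 & 25 \\ 0 & 3 & -30 \\ 0 & 0 & 9 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -5 & 25 \\ 3 & -30 \\ 0 & 9 \end{bmatrix} \tag{15} \] 


As a check, we will calculate $[T_2 \circ T_1]_{B',B}$ directly from Formula 4. Since $B = \{1, x\}$, it follows from 
Formula 4 with $\mathbf{u}_1 = 1$ and $\mathbf{u}_2 = x$ that 


\[ [T_2 \circ T_1]_{B',B} = \big[\, [(T_2 \circ T_1)(1)]_{B'} \mid [(T_2 \circ T_1)(x)]_{B'} \,\big] \tag{16} \] 
Using 13 yields 


\[ (T_2 \circ T_1)(1) = 3x - 5 \quad \text{and} \quad (T_2 \circ T_1)(x) = (3x - 5)^2 = 9x^2 - 30x + 25 \] 


From this and the fact that $B' = \{1, x, x^2\}$, it follows that 


\[ [(T_2 \circ T_1)(1)]_{B'} = \begin{bmatrix} -5 \\ 3 \\ 0 \end{bmatrix} \quad \text{and} \quad [(T_2 \circ T_1)(x)]_{B'} = \begin{bmatrix} 25 \\ -30 \\ 9 \end{bmatrix} \] 
Substituting in 16 yields 
\[ [T_2 \circ T_1]_{B',B} = \begin{bmatrix} -5 & 25 \\ 3 & -30 \\ 0 & 9 \end{bmatrix} \] 


which agrees with 15. 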
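

The matrix product in Formula 15 of Example 6 is easy to confirm by machine. This short sketch multiplies the two matrices from Examples 1 and 5 and compares the result with the columns obtained directly from Formula 13; `matmul` is an illustrative helper name.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

T1_matrix = [[0, 0], [1, 0], [0, 1]]                   # matrix for T1 from Example 1
T2_matrix = [[1, -5, 25], [0, 3, -30], [0, 0, 9]]      # matrix for T2 from Example 5

product = matmul(T2_matrix, T1_matrix)
print(product)                                         # [[-5, 25], [3, -30], [0, 9]]

# Columns computed directly: 3x - 5 and (3x - 5)^2 = 25 - 30x + 9x^2.
direct_columns = [[-5, 3, 0], [25, -30, 9]]
direct = [[col[i] for col in direct_columns] for i in range(3)]
print(product == direct)                               # True
```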


Concept Review 
* Matrix for a linear transformation relative to bases 
* Matrix for a linear operator relative to a basis 


* The three-step procedure for finding T(x) 
Skills 


* Find the matrix for a linear transformation $T: V \to W$ relative to bases of $V$ and $W$. 


* For a linear transformation $T: V \to W$, find $T(\mathbf{x})$ using the matrix for $T$ relative to bases of $V$ and $W$. 


Exercise Set 8.4 


1. Let T: P5 — Ра be the linear transformation defined by T(p(x)) = xp(x). 
(a) Find the matrix for T relative to the standard bases 
B= fui, uz, uz} and B'— ivi. V2, V3, v4} 
where 


=], u =x, u; — x? 


v=], v3—z, v3 — х2, v4— x? 


(b) Verify that the matrix [T] g' в obtained in part (a) satisfies Formula 5 for every vector x = eg cix + сэх? іп P3 


2. Let T: P4 — Ру be the linear transformation defined by 
та fax ар?) = (ao | ai)- (2a; | 3а2 х 


(а) Find the matrix for T relative to the standard bases 8 = { l, x, x i and BY = { lx i for Pa and F4. 
(b) Verify that the matrix [T] g'g obtained in part (a) satisfies Formula 5 for every vector x = cg + сүх + c3x? in P2 
3. Let $T: P_2 \to P_2$ be the linear operator defined by
$T(a_0 + a_1x + a_2x^2) = a_0 + a_1(x - 1) + a_2(x - 1)^2$


(a) Find the matrix for $T$ relative to the standard basis $B = \{1, x, x^2\}$ for $P_2$.


(b) Verify that the matrix $[T]_B$ obtained in part (a) satisfies Formula 8 for every vector $\mathbf{x} = a_0 + a_1x + a_2x^2$ in $P_2$.


Answer:

(a) $\begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}$


4. Let $T: R^2 \to R^2$ be the linear operator defined by


and let $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ be the basis for which


(a) Find $[T]_B$.
(b) Verify that Formula 8 holds for every vector $\mathbf{x}$ in $R^2$.


5. Let $T: R^2 \to R^3$ be defined by


(a) Find the matrix $[T]_{B',B}$ relative to the bases $B = \{\mathbf{u}_1, \mathbf{u}_2\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where


=} [3 


(b) Verify that Formula 5 holds for every vector in $R^2$.


Answer: 
(a) 0 0 
1 
=s 1 
8 4 
3 3 


6. Let $T: R^3 \to R^3$ be the linear operator defined by
$T(x_1, x_2, x_3) = (x_1 - x_2,\ x_2 - x_1,\ x_1 - x_3)$


(a) Find the matrix for $T$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where
$\mathbf{v}_1 = (1, 0, 1),\quad \mathbf{v}_2 = (0, 1, 1),\quad \mathbf{v}_3 = (1, 1, 0)$


(b) Verify that Formula 8 holds for every vector $\mathbf{x} = (x_1, x_2, x_3)$ in $R^3$.


(c) Is $T$ one-to-one? If so, find the matrix of $T^{-1}$ with respect to the basis $B$.


7. Let $T: P_2 \to P_2$ be the linear operator defined by $T(p(x)) = p(2x + 1)$, that is,
$T(c_0 + c_1x + c_2x^2) = c_0 + c_1(2x + 1) + c_2(2x + 1)^2$


(a) Find $[T]_B$ with respect to the basis $B = \{1, x, x^2\}$.


(b) Use the three-step procedure illustrated in Example 2 to compute $T(2 - 3x + 4x^2)$.


(c) Check the result obtained in part (b) by computing $T(2 - 3x + 4x^2)$ directly.


Answer:

(a) $\begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 4 \\ 0 & 0 & 4 \end{bmatrix}$

(b) $3 + 10x + 16x^2$

8. Let $T: P_2 \to P_3$ be the linear transformation defined by $T(p(x)) = xp(x - 3)$, that is,

$T(c_0 + c_1x + c_2x^2) = x\big(c_0 + c_1(x - 3) + c_2(x - 3)^2\big)$
(a) Find $[T]_{B',B}$ relative to the bases $B = \{1, x, x^2\}$ and $B' = \{1, x, x^2, x^3\}$.

(b) Use the three-step procedure illustrated in Example 2 to compute $T(1 + x - x^2)$.
(c) Check the result obtained in part (b) by computing $T(1 + x - x^2)$ directly.


е Let v; = H and v; — Lil and let 


123 
А= 
[-2 5] 
be the matrix for $T: R^2 \to R^2$ relative to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2\}$.


(a) Find $[T(\mathbf{v}_1)]_B$ and $[T(\mathbf{v}_2)]_B$.
(b) Find $T(\mathbf{v}_1)$ and $T(\mathbf{v}_2)$.


(c) Find a formula for [| : 


(d) Use the formula obtained in (c) to compute al | | 


Answer: 


9 тов 5| їТ®д1в= [5] 
9 rep -| 5| re»- [55] 


29 
(c) 


H 


10. 3 —2 1 0 
LetA—| 1 6 2 l|bethe matrix for 7: R4 —, R? relative to the bases B = (v, v2, v3, уд) and 
-3 071 
B= fwi, №2, w3}, where 
0 2 1 6 
€ 1 M 1 HE 4 A 9 
1 —1 -1" 4 
1 —1 2 2 
0 -7 —6 
му = | 8 w2—| 8| wa=| 9 
8 1 1 


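(a) Find $[T(\mathbf{v}_1)]_{B'}$, $[T(\mathbf{v}_2)]_{B'}$, $[T(\mathbf{v}_3)]_{B'}$, and $[T(\mathbf{v}_4)]_{B'}$.
(b) Find $T(\mathbf{v}_1)$, $T(\mathbf{v}_2)$, $T(\mathbf{v}_3)$, and $T(\mathbf{v}_4)$.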


(c) X1 
x2 
Find a formula for T 
x3 
X4 
(d) 2 
Use the formula obtained in (c) to compute 7 ^ 
0 
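11. Let $A = \begin{bmatrix} 1 & 3 & -1 \\ 2 & 0 & 5 \\ 6 & -2 & 4 \end{bmatrix}$ be the matrix for $T: P_2 \to P_2$ with respect to the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, where

$\mathbf{v}_1 = 3x + 3x^2,\quad \mathbf{v}_2 = -1 + 3x + 2x^2,\quad \mathbf{v}_3 = 3 + 7x + 2x^2$

(a) Find $[T(\mathbf{v}_1)]_B$, $[T(\mathbf{v}_2)]_B$, and $[T(\mathbf{v}_3)]_B$.
(b) Find $T(\mathbf{v}_1)$, $T(\mathbf{v}_2)$, and $T(\mathbf{v}_3)$.
(c) Find a formula for $T(a_0 + a_1x + a_2x^2)$.
(d) Use the formula obtained in (c) to compute $T(1 + x^2)$.


Answer:

(a) $[T(\mathbf{v}_1)]_B = \begin{bmatrix} 1 \\ 2 \\ 6 \end{bmatrix},\quad [T(\mathbf{v}_2)]_B = \begin{bmatrix} 3 \\ 0 \\ -2 \end{bmatrix},\quad [T(\mathbf{v}_3)]_B = \begin{bmatrix} -1 \\ 5 \\ 4 \end{bmatrix}$

(b) $T(\mathbf{v}_1) = 16 + 51x + 19x^2,\quad T(\mathbf{v}_2) = -6 - 5x + 5x^2,\quad T(\mathbf{v}_3) = 7 + 40x + 15x^2$

(c) $T(a_0 + a_1x + a_2x^2) = \dfrac{239a_0 - 161a_1 + 289a_2}{24} + \dfrac{201a_0 - 111a_1 + 247a_2}{8}\,x + \dfrac{61a_0 - 31a_1 + 107a_2}{12}\,x^2$

(d) $T(1 + x^2) = 22 + 56x + 14x^2$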


12. Let $T_1: P_1 \to P_2$ be the linear transformation defined by
$T_1(p(x)) = xp(x)$
and let $T_2: P_2 \to P_2$ be the linear operator defined by
$T_2(p(x)) = p(2x + 1)$
Let $B = \{1, x\}$ and $B' = \{1, x, x^2\}$ be the standard bases for $P_1$ and $P_2$.
(a) Find $[T_2 \circ T_1]_{B',B}$, $[T_2]_{B'}$, and $[T_1]_{B',B}$.
(b) State a formula relating the matrices in part (a).


(c) Verify that the matrices in part (a) satisfy the formula you stated in part (b).


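13. Let $T_1: P_1 \to P_2$ be the linear transformation defined by
$T_1(c_0 + c_1x) = 2c_0 - 3c_1x$
and let $T_2: P_2 \to P_3$ be the linear transformation defined by
$T_2(c_0 + c_1x + c_2x^2) = 3c_0x + 3c_1x^2 + 3c_2x^3$
Let $B = \{1, x\}$, $B'' = \{1, x, x^2\}$, and $B' = \{1, x, x^2, x^3\}$.


(a) Find $[T_2 \circ T_1]_{B',B}$, $[T_2]_{B',B''}$, and $[T_1]_{B'',B}$.
(b) State a formula relating the matrices in part (a).


(c) Verify that the matrices in part (a) satisfy the formula you stated in part (b).


Answer:

(a) $[T_2 \circ T_1]_{B',B} = \begin{bmatrix} 0 & 0 \\ 6 & 0 \\ 0 & -9 \\ 0 & 0 \end{bmatrix},\quad [T_2]_{B',B''} = \begin{bmatrix} 0 & 0 & 0 \\ 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix},\quad [T_1]_{B'',B} = \begin{bmatrix} 2 & 0 \\ 0 & -3 \\ 0 & 0 \end{bmatrix}$

(b) $[T_2 \circ T_1]_{B',B} = [T_2]_{B',B''}\,[T_1]_{B'',B}$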
14. Show that if $T: V \to W$ is the zero transformation, then the matrix for $T$ with respect to any bases for $V$ and $W$ is a zero
matrix.


15. Show that if $T: V \to V$ is a contraction or a dilation of $V$ (Example 4 of Section 8.1), then the matrix for $T$ relative to
any basis for $V$ is a positive scalar multiple of the identity matrix.


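16. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis for a vector space $V$. Find the matrix with respect to $B$ of the linear operator
$T: V \to V$ defined by $T(\mathbf{v}_1) = \mathbf{v}_2$, $T(\mathbf{v}_2) = \mathbf{v}_3$, $T(\mathbf{v}_3) = \mathbf{v}_4$, $T(\mathbf{v}_4) = \mathbf{v}_1$.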

17. Prove that if $B$ and $B'$ are the standard bases for $R^n$ and $R^m$, respectively, then the matrix for a linear transformation
$T: R^n \to R^m$ relative to the bases $B$ and $B'$ is the standard matrix for $T$.


18. (Calculus required) Let $D: P_2 \to P_2$ be the differentiation operator $D(\mathbf{p}) = p'(x)$. In parts (a) and (b), find the
matrix of $D$ relative to the basis $B = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3\}$.

(a) $\mathbf{p}_1 = 1,\ \mathbf{p}_2 = x,\ \mathbf{p}_3 = x^2$

(b) $\mathbf{p}_1 = 2,\ \mathbf{p}_2 = 2 - 3x,\ \mathbf{p}_3 = 2 - 3x + 8x^2$

(c) Use the matrix in part (a) to compute $D(6 - 6x + 24x^2)$.

(d) Repeat the directions for part (c) for the matrix in part (b).


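19. (Calculus required) In each part, suppose that $B = \{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$ is a basis for a subspace $V$ of the vector space of
real-valued functions defined on the real line. Find the matrix with respect to $B$ for the differentiation operator $D: V \to V$.
(a) $\mathbf{f}_1 = 1,\ \mathbf{f}_2 = \sin x,\ \mathbf{f}_3 = \cos x$

(b) $\mathbf{f}_1 = 1,\ \mathbf{f}_2 = e^x,\ \mathbf{f}_3 = e^{2x}$

(c) $\mathbf{f}_1 = e^{2x},\ \mathbf{f}_2 = xe^{2x},\ \mathbf{f}_3 = x^2e^{2x}$

(d) Use the matrix in part (c) to compute $D(4e^{2x} + 6xe^{2x} - 10x^2e^{2x})$.


Answer:

(a) $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$

(b) $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$

(c) $\begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{bmatrix}$

(d) $14e^{2x} - 8xe^{2x} - 20x^2e^{2x}$, since $\begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ 6 \\ -10 \end{bmatrix} = \begin{bmatrix} 14 \\ -8 \\ -20 \end{bmatrix}$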


20. Let $V$ be a four-dimensional vector space with basis $B$, let $W$ be a seven-dimensional vector space with basis $B'$, and let
$T: V \to W$ be a linear transformation. Identify the four vector spaces that contain the vectors at the corners of the
accompanying diagram.


Figure Ex-20 (diagram relating $\mathbf{x}$, $T(\mathbf{x})$, $[\mathbf{x}]_B$, and $[T(\mathbf{x})]_{B'}$; its arrows are labeled "Direct computation" and "Multiply by $[T]_{B',B}$")


21. In each part, fill in the missing part of the equation.
(a) $[T_2 \circ T_1]_{B',B} = [T_2]_{?}\,[T_1]_{B'',B}$
(b) $[T_3 \circ T_2 \circ T_1]_{B',B} = [T_3]_{?}\,[T_2]_{B''',B''}\,[T_1]_{B'',B}$


Answer:


(a) $B', B''$
(b) $B', B'''$


True-False Exercises 


In parts (a)-(e) determine whether the statement is true or false, and justify your answer. 


(a) If the matrix of a linear transformation 7: — Wy relative to some bases of V and W is [ | then there is a nonzero 


vector x in V such that T(x) = 2x. 
Answer: 


False 


(b) If the matrix of a linear transformation Т. —, jy relative to bases for V and W is f | then there is a nonzero 
vector x in V such that T(x) — 4x. 
Answer: 
False 


(c) If the matrix of a linear transformation 7: [7 — } relative to certain bases for V and W is E 1 then T is one-to-one. 


3 


Answer: 


True 


(d) If $S: V \to V$ and $T: V \to V$ are linear operators and $B$ is a basis for $V$, then the matrix of $S \circ T$ relative to $B$ is
$[T]_B[S]_B$.
Answer: 
False 

(e) If $T: V \to V$ is an invertible linear operator and $B$ is a basis for $V$, then the matrix for $T^{-1}$ relative to $B$ is $[T]_B^{-1}$.


Answer: 


True 
8.5 Similarity


The matrix for a linear operator $T: V \to V$ depends on the basis selected for $V$. One of the fundamental problems of linear
algebra is to choose a basis for $V$ that makes the matrix for $T$ as simple as possible: a diagonal or a triangular matrix, for
example. In this section we will study this problem.


Simple Matrices for Linear Operators 


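Standard bases do not necessarily produce the simplest matrices for linear operators. For example, consider the matrix
operator $T: R^2 \to R^2$ whose standard matrix is

$$[T] = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \tag{1}$$

and view $[T]$ as the matrix for $T$ relative to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ for $R^2$. Let us compare this to the matrix for
$T$ relative to the basis $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ for $R^2$ in which

$$\mathbf{u}_1' = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2' = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{2}$$

Since

$$T(\mathbf{u}_1') = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = 2\mathbf{u}_1' \quad\text{and}\quad T(\mathbf{u}_2') = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix} = 3\mathbf{u}_2'$$

it follows that

$$[T(\mathbf{u}_1')]_{B'} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \quad\text{and}\quad [T(\mathbf{u}_2')]_{B'} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$$

so the matrix for $T$ relative to the basis $B'$ is

$$[T]_{B'} = \big[\,[T(\mathbf{u}_1')]_{B'} \mid [T(\mathbf{u}_2')]_{B'}\,\big] = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

This matrix, being diagonal, has a simpler form than $[T]$ and conveys clearly that the operator $T$ scales $\mathbf{u}_1'$ by a factor of 2
and $\mathbf{u}_2'$ by a factor of 3, information that is not immediately evident from $[T]$.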
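The column-by-column construction of $[T]_{B'}$ can be sketched in a few lines. This is an illustration only: it takes the standard matrix to be $\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$ and the basis vectors to be $(1, 1)$ and $(1, 2)$, which are the values consistent with the scaling factors 2 and 3 described above. The helper names `apply` and `coords_2d` are hypothetical.

```python
from fractions import Fraction

def apply(A, v):
    """Multiply the matrix A (list of rows) by the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def coords_2d(v, b1, b2):
    """Coordinates of v relative to the basis {b1, b2} of R^2, by Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    c1 = Fraction(v[0] * b2[1] - b2[0] * v[1], det)
    c2 = Fraction(b1[0] * v[1] - v[0] * b1[1], det)
    return [c1, c2]

A = [[1, 1], [-2, 4]]      # assumed standard matrix [T]
u1, u2 = [1, 1], [1, 2]    # assumed basis B' = {u1', u2'}

# Each column of [T]_{B'} is the B'-coordinate vector of the image of a basis vector.
col1 = coords_2d(apply(A, u1), u1, u2)
col2 = coords_2d(apply(A, u2), u1, u2)
T_Bprime = [[col1[0], col2[0]], [col1[1], col2[1]]]

print(T_Bprime)
```

The off-diagonal entries vanish exactly because each basis vector is mapped to a scalar multiple of itself.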


One of the major themes in more advanced linear algebra courses is to determine the “simplest possible form” that can be 
obtained for the matrix of a linear operator by choosing the basis appropriately. Sometimes it is possible to obtain a 
diagonal matrix (as above, for example), whereas other times one must settle for a triangular matrix or some other form. 
We will only be able to touch on this important topic in this text. 


The problem of finding a basis that produces the simplest possible matrix for a linear operator $T: V \to V$ can be attacked by
first finding a matrix for $T$ relative to any basis, typically a standard basis, where applicable, and then changing the basis in
a way that simplifies the matrix. Before pursuing this idea, it will be helpful to revisit some concepts about changing bases.


A New View of Transition Matrices 


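Recall from Formulas 7 and 8 of Section 4.6 that if $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2', \ldots, \mathbf{u}_n'\}$ are bases for a vector
space $V$, then the transition matrices from $B$ to $B'$ and from $B'$ to $B$ are

$$P_{B \to B'} = \big[\,[\mathbf{u}_1]_{B'} \mid [\mathbf{u}_2]_{B'} \mid \cdots \mid [\mathbf{u}_n]_{B'}\,\big] \tag{3}$$

$$P_{B' \to B} = \big[\,[\mathbf{u}_1']_{B} \mid [\mathbf{u}_2']_{B} \mid \cdots \mid [\mathbf{u}_n']_{B}\,\big] \tag{4}$$

where the matrices $P_{B \to B'}$ and $P_{B' \to B}$ are inverses of each other. We also showed in Formulas 9 and 10 of that section
that if $\mathbf{v}$ is any vector in $V$, then

$$P_{B \to B'}[\mathbf{v}]_B = [\mathbf{v}]_{B'} \tag{5}$$

$$P_{B' \to B}[\mathbf{v}]_{B'} = [\mathbf{v}]_B \tag{6}$$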
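Formulas (5) and (6) can be tried on a concrete vector. The sketch below assumes $B$ is the standard basis for $R^2$ and $B'$ consists of the vectors $(1, 1)$ and $(1, 2)$; the test vector $(3, 5)$ and the helper names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2 x 2 integer matrix, using the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def matvec(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Columns are the B' vectors written in standard (B) coordinates, as in Formula (4).
P_Bprime_to_B = [[1, 1], [1, 2]]
# The two transition matrices are inverses of each other.
P_B_to_Bprime = inverse_2x2(P_Bprime_to_B)

v_B = [3, 5]                                   # [v]_B for an arbitrary test vector
v_Bprime = matvec(P_B_to_Bprime, v_B)          # Formula (5)
round_trip = matvec(P_Bprime_to_B, v_Bprime)   # Formula (6)

print(v_Bprime, round_trip)
```

Applying (5) and then (6) returns the original coordinate vector, which is another way of seeing that the two transition matrices are inverses.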


The following theorem shows that transition matrices in Formulas 3 and 4 can be viewed as matrices for identity operators. 


THEOREM 8.5.1 


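If $B$ and $B'$ are bases for a finite-dimensional vector space $V$, and if $I: V \to V$ is the identity operator on $V$, then

$$P_{B \to B'} = [I]_{B',B} \quad\text{and}\quad P_{B' \to B} = [I]_{B,B'}$$


Proof Suppose that $B = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and $B' = \{\mathbf{u}_1', \mathbf{u}_2', \ldots, \mathbf{u}_n'\}$ are bases for $V$. Using the fact that $I(\mathbf{v}) = \mathbf{v}$ for all
$\mathbf{v}$ in $V$, it follows from Formula 4 of Section 8.4 that

$$\begin{aligned} {[I]}_{B',B} &= \big[\,[I(\mathbf{u}_1)]_{B'} \mid [I(\mathbf{u}_2)]_{B'} \mid \cdots \mid [I(\mathbf{u}_n)]_{B'}\,\big] \\ &= \big[\,[\mathbf{u}_1]_{B'} \mid [\mathbf{u}_2]_{B'} \mid \cdots \mid [\mathbf{u}_n]_{B'}\,\big] \\ &= P_{B \to B'} \qquad \text{[Formula (3) above]} \end{aligned}$$

The proof that $[I]_{B,B'} = P_{B' \to B}$ is similar.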


Effect of Changing Bases on Matrices of Linear Operators 


We are now ready to consider the main problem in this section. 


PROBLEM 
If $B$ and $B'$ are two bases for a finite-dimensional vector space $V$, and if $T: V \to V$ is a linear operator, what
relationship, if any, exists between the matrices $[T]_B$ and $[T]_{B'}$?


The answer to this question can be obtained by considering the composition of the three linear operators on V pictured in 
Figure 8.5.1. 


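Figure 8.5.1 (diagram: $\mathbf{v} \xrightarrow{\;I\;} \mathbf{v} \xrightarrow{\;T\;} T(\mathbf{v}) \xrightarrow{\;I\;} T(\mathbf{v})$, where the four copies of $V$ carry the bases $B'$, $B$, $B$, $B'$ in that order)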


In this figure, $\mathbf{v}$ is first mapped into itself by the identity operator, then $\mathbf{v}$ is mapped into $T(\mathbf{v})$ by $T$, and then $T(\mathbf{v})$ is
mapped into itself by the identity operator. All four vector spaces involved in the composition are the same (namely, $V$), but
the bases for the spaces vary. Since the starting vector is $\mathbf{v}$ and the final vector is $T(\mathbf{v})$, the composition produces the same
result as applying $T$ directly; that is,
$$T = I \circ T \circ I \tag{7}$$
If, as illustrated in Figure 8.5.1, the first and last vector spaces are assigned the basis $B'$ and the middle two spaces are
assigned the basis $B$, then it follows from (7) and Formula 12 of Section 8.4 (with an appropriate adjustment to the names of
the bases) that
$$[T]_{B',B'} = [I \circ T \circ I]_{B',B'} = [I]_{B',B}\,[T]_{B,B}\,[I]_{B,B'} \tag{8}$$
or, in simpler notation,
$$[T]_{B'} = [I]_{B',B}\,[T]_{B}\,[I]_{B,B'} \tag{9}$$
We can simplify this formula even further by using Theorem 8.5.1 to rewrite it as

$$[T]_{B'} = P_{B \to B'}\,[T]_{B}\,P_{B' \to B} \tag{10}$$


In summary, we have the following theorem. 


THEOREM 8.5.2 


Let $T: V \to V$ be a linear operator on a finite-dimensional vector space $V$, and let $B$ and $B'$ be bases for $V$. Then
$$[T]_{B'} = P^{-1}[T]_B P \tag{11}$$

where $P = P_{B' \to B}$ and $P^{-1} = P_{B \to B'}$.


Warning When applying Theorem 8.5.2, it is easy to forget whether $P = P_{B' \to B}$ (correct) or $P = P_{B \to B'}$ (incorrect). It
may help to use the diagram in Figure 8.5.2 and observe that the exterior subscripts of the transition matrices match the
subscript of the matrix they enclose.


$$[T]_{B'} = P_{B \to B'}\,[T]_B\,P_{B' \to B}$$

Exterior subscripts

Figure 8.5.2
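Formula (11) translates directly into a short routine. The following is a sketch under stated assumptions rather than a definitive implementation: the names `matmul`, `inverse`, and `change_of_basis` are hypothetical, exact rational arithmetic is used to avoid roundoff, and the sample data are the operator with standard matrix $\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$ and the basis $B'$ with vectors $(1, 1)$ and $(1, 2)$, whose coordinates form the columns of $P = P_{B' \to B}$.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def change_of_basis(T_B, P):
    """Return P^{-1} [T]_B P, where P is the transition matrix from B' to B."""
    return matmul(matmul(inverse(P), T_B), P)

T_B = [[1, 1], [-2, 4]]   # [T]_B relative to the standard basis (assumed data)
P = [[1, 1], [1, 2]]      # columns: the B' vectors in B-coordinates
T_Bprime = change_of_basis(T_B, P)

print(T_Bprime)
```

Note that `P` must be the transition matrix from $B'$ to $B$; passing the matrix for the opposite direction is exactly the mistake the warning above describes.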


In the terminology of Definition 1 of Section 5.2, Theorem 8.5.2 tells us that matrices representing the same linear operator 
relative to different bases must be similar. The following theorem is a rephrasing of Theorem 8.5.2 in the language of 
similarity. 


THEOREM 8.5.3 


Two matrices, $A$ and $B$, are similar if and only if they represent the same linear operator. Moreover, if $B = P^{-1}AP$,
then $P$ is the transition matrix from the basis relative to matrix $B$ to the basis relative to matrix $A$.


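EXAMPLE 1 Similar Matrices Represent the Same Linear Operator


We showed at the beginning of this section that the matrices

$$C = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \quad\text{and}\quad D = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

represent the same linear operator $T: R^2 \to R^2$. Verify that these matrices are similar by finding a matrix $P$ for
which $D = P^{-1}CP$.


Solution We need to find the transition matrix
$$P = P_{B' \to B} = \big[\,[\mathbf{u}_1']_B \mid [\mathbf{u}_2']_B\,\big]$$
where $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ is the basis for $R^2$ given by (2) and $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ is the standard basis for $R^2$. We see by
inspection that
$$\mathbf{u}_1' = \mathbf{e}_1 + \mathbf{e}_2$$
$$\mathbf{u}_2' = \mathbf{e}_1 + 2\mathbf{e}_2$$

from which it follows that
$$[\mathbf{u}_1']_B = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad [\mathbf{u}_2']_B = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Thus,

$$P = P_{B' \to B} = \big[\,[\mathbf{u}_1']_B \mid [\mathbf{u}_2']_B\,\big] = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$$

We leave it for you to verify that

$$P^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}$$

and hence that

$$\underbrace{\begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}}_{D} = \underbrace{\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}}_{P^{-1}} \underbrace{\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}}_{C} \underbrace{\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}}_{P}$$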


Similarity Invariants 


Recall from Section 5.2 that a property of a square matrix is called a similarity invariant if that property is shared by all
similar matrices. In Table 1 of that section (table reproduced below), we listed the most important similarity invariants.
Since we know from Theorem 8.5.3 that two matrices are similar if and only if they represent the same linear operator
$T: V \to V$, it follows that if $B$ and $B'$ are bases for $V$, then every similarity invariant property of $[T]_B$ is also a similarity
invariant property of $[T]_{B'}$ for any other basis $B'$ for $V$. For example, for any two bases $B$ and $B'$ we must have
$$\det([T]_B) = \det([T]_{B'})$$

It follows from this equation that the value of the determinant depends on $T$, but not on the particular basis that is used to
obtain the matrix for $T$. Thus, the determinant can be regarded as a property of the linear operator $T$; indeed, if $V$ is a
finite-dimensional vector space, then we can define the determinant of the linear operator $T$ to be

$$\det(T) = \det([T]_B) \tag{12}$$

where $B$ is any basis for $V$.
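The basis independence behind this definition follows in one line from Theorem 8.5.2. The derivation below is a sketch that assumes only the product rule $\det(AB) = \det(A)\det(B)$ for square matrices and the fact that $\det(P^{-1}) = 1/\det(P)$, with $P = P_{B' \to B}$:

```latex
\begin{aligned}
\det\big([T]_{B'}\big) &= \det\big(P^{-1}[T]_B\,P\big) \\
                       &= \det(P^{-1})\,\det\big([T]_B\big)\,\det(P) \\
                       &= \frac{1}{\det(P)}\,\det\big([T]_B\big)\,\det(P)
                        = \det\big([T]_B\big)
\end{aligned}
```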


Table 1 Similarity Invariants

Property                     Description
Determinant                  $A$ and $P^{-1}AP$ have the same determinant.
Invertibility                $A$ is invertible if and only if $P^{-1}AP$ is invertible.
Rank                         $A$ and $P^{-1}AP$ have the same rank.
Nullity                      $A$ and $P^{-1}AP$ have the same nullity.
Trace                        $A$ and $P^{-1}AP$ have the same trace.
Characteristic polynomial    $A$ and $P^{-1}AP$ have the same characteristic polynomial.
Eigenvalues                  $A$ and $P^{-1}AP$ have the same eigenvalues.
Eigenspace dimension         If $\lambda$ is an eigenvalue of $A$ and $P^{-1}AP$, then the eigenspace of $A$ corresponding to $\lambda$ and the
                             eigenspace of $P^{-1}AP$ corresponding to $\lambda$ have the same dimension.


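EXAMPLE 2 Determinant of a Linear Operator


At the beginning of this section we showed that the matrices

$$[T] = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \quad\text{and}\quad [T]_{B'} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

represent the same linear operator relative to different bases, the first relative to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2\}$
for $R^2$ and the second relative to the basis $B' = \{\mathbf{u}_1', \mathbf{u}_2'\}$ for which

$$\mathbf{u}_1' = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2' = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

This means that $[T]$ and $[T]_{B'}$ must be similar matrices and hence must have the same similarity invariant
properties. In particular, they must have the same determinant. We leave it for you to verify that

$$\det[T] = \begin{vmatrix} 1 & 1 \\ -2 & 4 \end{vmatrix} = 6 \quad\text{and}\quad \det[T]_{B'} = \begin{vmatrix} 2 & 0 \\ 0 & 3 \end{vmatrix} = 6$$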


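EXAMPLE 3 Eigenvalues and Bases for Eigenspaces
Find the eigenvalues and bases for the eigenspaces of the linear operator $T: P_2 \to P_2$ defined by

$$T(a + bx + cx^2) = -2c + (a + 2b + c)x + (a + 3c)x^2$$

Solution We leave it for you to show that the matrix for $T$ with respect to the standard basis
$B = \{1, x, x^2\}$ is

$$[T]_B = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

The eigenvalues of $T$ are $\lambda = 1$ and $\lambda = 2$ (Example 7 of Section 5.1). Also from that example, the
eigenspace of $[T]_B$ corresponding to $\lambda = 2$ has the basis $\{\mathbf{u}_1, \mathbf{u}_2\}$, where

$$\mathbf{u}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

and the eigenspace of $[T]_B$ corresponding to $\lambda = 1$ has the basis $\{\mathbf{u}_3\}$, where

$$\mathbf{u}_3 = \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$$

The matrices $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ are the coordinate matrices relative to $B$ of
$$\mathbf{p}_1 = -1 + x^2, \quad \mathbf{p}_2 = x, \quad \mathbf{p}_3 = -2 + x + x^2$$

Thus, the eigenspace of $T$ corresponding to $\lambda = 2$ has the basis

$$\{\mathbf{p}_1, \mathbf{p}_2\} = \{-1 + x^2,\ x\}$$

and that corresponding to $\lambda = 1$ has the basis

$$\{\mathbf{p}_3\} = \{-2 + x + x^2\}$$

As a check, you can use the given formula for $T$ to verify that

$$T(\mathbf{p}_1) = 2\mathbf{p}_1, \quad T(\mathbf{p}_2) = 2\mathbf{p}_2, \quad\text{and}\quad T(\mathbf{p}_3) = \mathbf{p}_3$$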
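The final check in this example can be carried out both ways, with the matrix and with the formula for $T$. The sketch below stores a polynomial $a + bx + cx^2$ as the list `[a, b, c]`; the function names `matvec` and `T` are illustrative.

```python
def matvec(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def T(p):
    """Apply T(a + bx + cx^2) = -2c + (a + 2b + c)x + (a + 3c)x^2 to [a, b, c]."""
    a, b, c = p
    return [-2 * c, a + 2 * b + c, a + 3 * c]

T_B = [[0, 0, -2], [1, 2, 1], [1, 0, 3]]   # matrix of T relative to B = {1, x, x^2}

# (eigenvalue, coordinate vector relative to B) pairs from the example
eigenpairs = [(2, [-1, 0, 1]), (2, [0, 1, 0]), (1, [-2, 1, 1])]

for lam, v in eigenpairs:
    scaled = [lam * x for x in v]
    # The matrix and the formula should both send v to lam * v.
    print(lam, v, matvec(T_B, v) == scaled, T(v) == scaled)
```

Because the coordinate map relative to the standard basis of $P_2$ simply reads off coefficients, the matrix check and the formula check operate on the same lists.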


Concept Review 

* Similarity of matrices representing a linear operator 
* Similarity invariant 

* Determinant of a linear operator 

Skills 


* Show that two matrices $A$ and $B$ represent the same linear operator, and find a transition matrix $P$ so that
$B = P^{-1}AP$.


* Find the eigenvalues and bases for the eigenspaces of a linear operator on a finite-dimensional vector space. 


Exercise Set 8.5 


In Exercises 1-7, find the matrix for $T$ relative to the basis $B$, and use Theorem 8.5.2 to compute the matrix for $T$ relative
to the basis $B'$.


1. T- £? — р? is defined by 
| = Ё = "d 


Answer: 
-— 35 

1 -2 11 11 

11 11 


2. T: R? _, R? is defined by 


and 8 = (uj, uj) and 3! = Ívi, v2], where 
eel m] T Welt wl 
Ple ЖО tds ? [-2 


3. $T: R^2 \to R^2$ is the rotation about the origin through an angle of 45°; $B$ and $B'$ are the bases in Exercise 1.


Answer: 
i d 13 25 
nu? 96 nys 142 
y2 ү? 1/2 12 
4. T- R? — R? is defined by 
ži x1 2х9 = х3 
Т || х2 | | = -X2 
х3 x1 7х3 


and B 15 the standard basis for R? and B! = ivi, v3, уз}, where 
1 1 
vi—|0| ж = |1 |, 
0 0 


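5. $T: R^3 \to R^3$ is the orthogonal projection on the xy-plane, and $B$ and $B'$ are as in Exercise 4.


Answer:

$$[T]_B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad [T]_{B'} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

6. $T: R^2 \to R^2$ is defined by $T(\mathbf{x}) = 5\mathbf{x}$, and $B$ and $B'$ are the bases in Exercise 2.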


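7. $T: P_1 \to P_1$ is defined by $T(a_0 + a_1x) = a_0 + a_1(x + 1)$, and $B = \{\mathbf{p}_1, \mathbf{p}_2\}$ and $B' = \{\mathbf{q}_1, \mathbf{q}_2\}$, where $\mathbf{p}_1 = 6 + 3x$,
$\mathbf{p}_2 = 10 + 2x$, $\mathbf{q}_1 = 2$, $\mathbf{q}_2 = 3 + 2x$.


Answer:

$$[T]_B = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{2}{9} \\ \tfrac{1}{2} & \tfrac{4}{3} \end{bmatrix}, \quad [T]_{B'} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$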

8. Find $\det(T)$.


(a) $T: R^2 \to R^2$, where $T(x_1, x_2) = (3x_1 - 4x_2,\ -x_1 + 7x_2)$
(b) $T: R^3 \to R^3$, where $T(x_1, x_2, x_3) = (x_1 - x_2,\ x_2 - x_3,\ x_3 - x_1)$
(c) $T: P_2 \to P_2$, where $T(p(x)) = p(x - 1)$


9. Prove that the following are similarity invariants: 
(a) rank 
(b) nullity 
(c) invertibility 


10. Let $T: P_4 \to P_4$ be the linear operator given by the formula $T(p(x)) = p(2x + 1)$.
(a) Find a matrix for T relative to some convenient basis, and then use it to find the rank and nullity of T. 


(b) Use the result in part (a) to determine whether T is one-to-one. 


11. In each part, find a basis for $R^2$ relative to which the matrix for $T$ is diagonal.
(a) T xinh — x1—22 
x2|} | 2xy 4х9 
(0 „Тт 4х] = х2 
xaj —3x|-Fx3 
Answer: 


os LT 


(b) -3- 421 -3 + {21 
= 6 6 


12. In each part, find a basis for $R^3$ relative to which the matrix for $T$ is diagonal.


(а) х1 = 2х] x2— х3 


Т || х2 | |= х= 253 = X3 
aa —X|-— х2—2х3 
(b) Х| =х2 + х3 
Т|| х2 | |= | =х1 + х3 
x3 ХІ +2 
(с) X1 4x| 4- x3 
T|| x2 | |= | 2x1 + 3x2 + 2x3 
х3 


x1 +4x3 


13. Let $T: P_2 \to P_2$ be defined by
$$T(a_0 + a_1x + a_2x^2) = (5a_0 + 6a_1 + 2a_2) - (a_1 + 8a_2)x + (a_0 - 2a_2)x^2$$

(a) Find the eigenvalues of $T$.
(b) Find bases for the eigenspaces of $T$.


Answer:


(a) $\lambda = -4$, $\lambda = 3$
(b) Basis for the eigenspace corresponding to $\lambda = -4$: $-2 + \tfrac{8}{3}x + x^2$; basis for the eigenspace corresponding to
$\lambda = 3$: $5 - 2x + x^2$


14. Let $T: M_{22} \to M_{22}$ be defined by


(a) Find the eigenvalues of T. 
(b) Find bases for the eigenspaces of T. 


15. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$. Prove that the eigenvectors of $T$ corresponding to $\lambda$ are the
nonzero vectors in the kernel of $\lambda I - T$.


16. (a) Prove that if $A$ and $B$ are similar matrices, then $A^2$ and $B^2$ are also similar. More generally, prove that $A^k$ and $B^k$
are similar if $k$ is any positive integer.


(b) If $A^2$ and $B^2$ are similar, must $A$ and $B$ be similar? Explain.


17. Let $C$ and $D$ be $m \times n$ matrices, and let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for a vector space $V$. Show that if
$C[\mathbf{x}]_B = D[\mathbf{x}]_B$ for all $\mathbf{x}$ in $V$, then $C = D$.


18. Find two nonzero 2 x 2 matrices that are not similar, and explain why they are not. 


19. Complete the proof below by justifying each step. 
Hypothesis: A and B are similar matrices. 
Conclusion: A and B have the same characteristic polynomial. 


Proof: 
1. $\det(\lambda I - B) = \det(\lambda I - P^{-1}AP)$


6. $= \det(\lambda I - A)$


20. If $A$ and $B$ are similar matrices, say $B = P^{-1}AP$, then it follows from Exercise 19 that $A$ and $B$ have the same
eigenvalues. Suppose that $\lambda$ is one of the common eigenvalues and $\mathbf{x}$ is a corresponding eigenvector of $A$. See if you can
find an eigenvector of $B$ corresponding to $\lambda$ (expressed in terms of $\lambda$, $\mathbf{x}$, and $P$).


21. Since the standard basis for $R^n$ is so simple, why would one want to represent a linear operator on $R^n$ in another basis?
Answer:


The choice of an appropriate basis can yield a better understanding of the linear operator.


22. Prove that trace is a similarity invariant. 
True-False Exercises 
In parts (a)—(h) determine whether the statement is true or false, and justify your answer. 
(a) A matrix cannot be similar to itself. 
Answer: 


False 


(b) If A is similar to B, and B is similar to C, then A is similar to C. 
Answer: 


True 


(c) If A and B are similar and B is singular, then A is singular. 
Answer: 


True 


(d) If $A$ and $B$ are invertible and similar, then $A^{-1}$ and $B^{-1}$ are similar.


Answer: 


True 
(e) If $T_1: R^n \to R^n$ and $T_2: R^n \to R^n$ are linear operators, and if $[T_1]_{B',B} = [T_2]_{B',B}$ with respect to two bases $B$ and $B'$
for $R^n$, then $T_1(\mathbf{x}) = T_2(\mathbf{x})$ for every vector $\mathbf{x}$ in $R^n$.


Answer: 


True 


(f) If $T_1: R^n \to R^n$ is a linear operator, and if $[T_1]_B = [T_1]_{B'}$ with respect to two bases $B$ and $B'$ for $R^n$, then $B = B'$.


Answer: 


False 


(g) If $T: R^n \to R^n$ is a linear operator, and if $[T]_B = I_n$ with respect to some basis $B$ for $R^n$, then $T$ is the identity operator
on $R^n$.


Answer: 


True 


(h) If $T: R^n \to R^n$ is a linear operator, and if $[T]_{B',B} = I_n$ with respect to two bases $B$ and $B'$ for $R^n$, then $T$ is the identity
operator on $R^n$.


Answer: 


False 
Chapter 8 Supplementary Exercises


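1. Let $A$ be an $n \times n$ matrix, $B$ a nonzero $n \times 1$ matrix, and $\mathbf{x}$ a vector in $R^n$ expressed in matrix notation. Is
$T(\mathbf{x}) = A\mathbf{x} + B$ a linear operator on $R^n$? Justify your answer.


Answer:


No. $T(\mathbf{x}_1 + \mathbf{x}_2) = A(\mathbf{x}_1 + \mathbf{x}_2) + B \neq (A\mathbf{x}_1 + B) + (A\mathbf{x}_2 + B) = T(\mathbf{x}_1) + T(\mathbf{x}_2)$, and if $c \neq 1$, then
$T(c\mathbf{x}) = cA\mathbf{x} + B \neq c(A\mathbf{x} + B) = cT(\mathbf{x})$.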


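2. Let

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

(a) Show that

$$A^2 = \begin{bmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{bmatrix} \quad\text{and}\quad A^3 = \begin{bmatrix} \cos 3\theta & -\sin 3\theta \\ \sin 3\theta & \cos 3\theta \end{bmatrix}$$

(b) Based on your answer to part (a), make a guess at the form of the matrix $A^n$ for any positive integer $n$.


(c) By considering the geometric effect of multiplication by $A$, obtain the result in part (b) geometrically.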


3. Let $T: V \to V$ be defined by $T(\mathbf{v}) = \|\mathbf{v}\|\mathbf{v}$. Show that $T$ is not a linear operator on $V$.


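4. Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ be fixed vectors in $R^n$, and let $T: R^n \to R^m$ be the function defined by
$T(\mathbf{x}) = (\mathbf{x} \cdot \mathbf{v}_1,\ \mathbf{x} \cdot \mathbf{v}_2,\ \ldots,\ \mathbf{x} \cdot \mathbf{v}_m)$, where $\mathbf{x} \cdot \mathbf{v}_i$ is the Euclidean inner product on $R^n$.
(a) Show that $T$ is a linear transformation.


(b) Show that the matrix with row vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ is the standard matrix for $T$.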


5. Let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4\}$ be the standard basis for $R^4$, and let $T: R^4 \to R^3$ be the linear transformation for
which
$T(\mathbf{e}_1) = (1, 2, 1)$, $T(\mathbf{e}_2) = (0, 1, 0)$,
$T(\mathbf{e}_3) = (1, 3, 0)$, $T(\mathbf{e}_4) = (1, 1, 1)$


(a) Find bases for the range and kernel of $T$.
(b) Find the rank and nullity of $T$.


Answer:


(a) $T(\mathbf{e}_3)$ and any two of $T(\mathbf{e}_1)$, $T(\mathbf{e}_2)$, and $T(\mathbf{e}_4)$ form bases for the range; $(-1, 1, 0, 1)$ is a basis
for the kernel.
(b) Rank = 3, nullity = 1


6. Suppose that vectors in $R^3$ are denoted by $1 \times 3$ matrices, and define $T: R^3 \to R^3$ by


E 
T([x x2 x3])= [x1 x2 x3]| 3 0 1 
225 


(a) Find a basis for the kernel of T. 
(b) Find a basis for the range of T. 
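7. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis for a vector space $V$, and let $T: V \to V$ be the linear operator for
which

$T(\mathbf{v}_1) = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + 3\mathbf{v}_4$

$T(\mathbf{v}_2) = \mathbf{v}_1 - \mathbf{v}_2 + 2\mathbf{v}_3 + 2\mathbf{v}_4$

$T(\mathbf{v}_3) = 2\mathbf{v}_1 - 4\mathbf{v}_2 + 5\mathbf{v}_3 + 3\mathbf{v}_4$

$T(\mathbf{v}_4) = -2\mathbf{v}_1 + 6\mathbf{v}_2 - 6\mathbf{v}_3 - 2\mathbf{v}_4$


(a) Find the rank and nullity of $T$.


(b) Determine whether $T$ is one-to-one.


Answer:


(a) $\text{rank}(T) = 2$ and $\text{nullity}(T) = 2$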
(b) T is not one-to-one. 


8. Let $V$ and $W$ be vector spaces, let $T$, $T_1$, and $T_2$ be linear transformations from $V$ to $W$, and let $k$ be a scalar.
Define new transformations, $T_1 + T_2$ and $kT$, by the formulas


$(T_1 + T_2)(\mathbf{x}) = T_1(\mathbf{x}) + T_2(\mathbf{x})$
$(kT)(\mathbf{x}) = k(T(\mathbf{x}))$
(a) Show that $(T_1 + T_2): V \to W$ and $kT: V \to W$ are both linear transformations.


(b) Show that the set of all linear transformations from V to W with the operations in part (a) is a vector 
space. 


9. Let $A$ and $B$ be similar matrices. Prove:
(a) $A^T$ and $B^T$ are similar.
(b) If $A$ and $B$ are invertible, then $A^{-1}$ and $B^{-1}$ are similar.


10. Fredholm Alternative Theorem Let $T: V \to V$ be a linear operator on an n-dimensional vector space.
Prove that exactly one of the following statements holds: 


(i) The equation T(x) = b has a solution for all vectors b in V. 
(ii) Nullity of T > 0. 
11. Let $T: M_{22} \to M_{22}$ be the linear operator defined by


то) =| Jj | x | 


Find the rank and nullity of $T$.
Answer: 


Rank = 3, nullity = 1 


12. Prove: If A and B are similar matrices, and if B and C are also similar matrices, then A and C are similar 
matrices. 


13. Let $L: M_{22} \to M_{22}$ be the linear operator that is defined by $L(M) = M^T$. Find the matrix for $L$ with
respect to the standard basis for $M_{22}$.


Answer:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
14. Let $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be bases for a vector space $V$, and let
2 =1 3 
Р=|1 14 
0 Е: 


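be the transition matrix from $B'$ to $B$.
(a) Express $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ as linear combinations of $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$.


(b) Express $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ as linear combinations of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$.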


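15. Let $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ be a basis for a vector space $V$, and let $T: V \to V$ be a linear operator for which

$$[T]_B = \begin{bmatrix} -3 & 4 & 7 \\ 1 & 0 & -2 \\ 0 & 1 & 0 \end{bmatrix}$$

Find $[T]_{B'}$, where $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is the basis for $V$ defined by

$\mathbf{v}_1 = \mathbf{u}_1,\quad \mathbf{v}_2 = \mathbf{u}_1 + \mathbf{u}_2,\quad \mathbf{v}_3 = \mathbf{u}_1 + \mathbf{u}_2 + \mathbf{u}_3$


Answer:

$$[T]_{B'} = \begin{bmatrix} -4 & 0 & 9 \\ 1 & 0 & -2 \\ 0 & 1 & 1 \end{bmatrix}$$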
16. Show that the matrices


are similar but that 


are not. 
17. Suppose that $T: V \to V$ is a linear operator, and $B$ is a basis for $V$ for which
X|—2X2--X3 X1 
[Т(х)]в=| 72 Е [х]в=|Х2 
х] = х3 х3 


Find $[T]_B$.


Answer: 


ixi. d 
[Tlp=|0 1 0 
LE dz 


18. Let $T: V \to V$ be a linear operator. Prove that $T$ is one-to-one if and only if $\det(T) \neq 0$.

19. (Calculus required)

(a) Show that if $f = f(x)$ is twice differentiable, then the function $D: C^2(-\infty, \infty) \to F(-\infty, \infty)$
defined by $D(f) = f''(x)$ is a linear transformation.


(b) Find a basis for the kernel of $D$.
(c) Show that the set of functions satisfying the equation $D(f) = f(x)$ is a two-dimensional subspace of
$C^2(-\infty, \infty)$, and find a basis for this subspace.


Answer:


(b) $f(x) = x$, $g(x) = 1$
(c) $f(x) = e^x$, $g(x) = e^{-x}$


20. Let $T: P_2 \to R^3$ be the function defined by the formula

$$T(p(x)) = \begin{bmatrix} p(-1) \\ p(0) \\ p(1) \end{bmatrix}$$

(a) Find $T(x^2 + 5x + 6)$.


(b) Show that $T$ is a linear transformation.


(c) Show that $T$ is one-to-one.


(d) Find $T^{-1}(0, 3, 0)$.
(e) Sketch the graph of the polynomial in part (d).


21. Let $x_1$, $x_2$, and $x_3$ be distinct real numbers such that
$x_1 < x_2 < x_3$
and let $T: P_2 \to R^3$ be the function defined by the formula

$$T(p(x)) = \begin{bmatrix} p(x_1) \\ p(x_2) \\ p(x_3) \end{bmatrix}$$

(a) Show that $T$ is a linear transformation.
(b) Show that $T$ is one-to-one.


(c) Verify that if $a_1$, $a_2$, and $a_3$ are any real numbers, then

$$T^{-1}\left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}\right) = a_1P_1(x) + a_2P_2(x) + a_3P_3(x)$$

where

$$P_1(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}, \quad P_2(x) = \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)}, \quad P_3(x) = \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}$$

(d) What relationship exists between the graph of the function
$a_1P_1(x) + a_2P_2(x) + a_3P_3(x)$
and the points $(x_1, a_1)$, $(x_2, a_2)$, and $(x_3, a_3)$?


Answer:


(d) The points are on the graph.


22. (Calculus required) Let $p(x)$ and $q(x)$ be continuous functions, and let $V$ be the subspace of
$C(-\infty, +\infty)$ consisting of all twice differentiable functions. Define $L: V \to V$ by

$$L(y(x)) = y''(x) + p(x)y'(x) + q(x)y(x)$$
(a) Show that $L$ is a linear transformation.
(b) Consider the special case where $p(x) = 0$ and $q(x) = 1$. Show that the function
$\phi(x) = c_1 \sin x + c_2 \cos x$
is in the kernel of $L$ for all real values of $c_1$ and $c_2$.


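23. (Calculus required) Let $D: P_n \to P_n$ be the differentiation operator $D(\mathbf{p}) = \mathbf{p}'$. Show that the matrix for $D$
relative to the basis $B = \{1, x, x^2, \ldots, x^n\}$ is

$$\begin{bmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 2 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & n \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$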
24. (Calculus required) It can be shown that for any real number $c$, the vectors

$$1,\quad x - c,\quad \frac{(x - c)^2}{2!},\quad \ldots,\quad \frac{(x - c)^n}{n!}$$

form a basis for $P_n$. Find the matrix for the differentiation operator of Exercise 23 with respect to this
basis.


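25. (Calculus required) Let $J: P_n \to P_{n+1}$ be the integration transformation defined by

$$J(\mathbf{p}) = \int_0^x (a_0 + a_1t + \cdots + a_nt^n)\,dt = a_0x + \frac{a_1}{2}x^2 + \cdots + \frac{a_n}{n+1}x^{n+1}$$

where $\mathbf{p} = a_0 + a_1x + \cdots + a_nx^n$. Find the matrix for $J$ with respect to the standard bases for $P_n$ and
$P_{n+1}$.


Answer:

$$\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & \tfrac{1}{2} & 0 & \cdots & 0 \\ 0 & 0 & \tfrac{1}{3} & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \tfrac{1}{n+1} \end{bmatrix}$$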
CHAPTER 9


Numerical Methods 


CHAPTER CONTENTS 


9.1. LU-Decompositions 

9.2. The Power Method 

9.3. Internet Search Engines 

9.4. Comparison of Procedures for Solving Linear Systems 
9.5. Singular Value Decomposition 


9.6. Data Compression Using Singular Value Decomposition 


INTRODUCTION 


This chapter is concerned with "numerical methods" of linear algebra, an area of study
that encompasses techniques for solving large-scale linear systems and for finding
numerical approximations of various kinds. It is not our objective to discuss algorithms
and technical issues in fine detail, since there are many excellent books on the subject.
Rather, we will be concerned with introducing some of the basic ideas and exploring
important contemporary applications that rely heavily on numerical ideas: singular value
decomposition and data compression. A computing utility such as MATLAB,
Mathematica, or Maple is recommended for Sections 9.2 to 9.6.
9.1 LU-Decompositions


Up to now, we have focused on two methods for solving linear systems, Gaussian elimination (reduction to row
echelon form) and Gauss-Jordan elimination (reduction to reduced row echelon form). While these methods are
fine for the small-scale problems in this text, they are not suitable for large-scale problems in which computer
roundoff error, memory usage, and speed are concerns. In this section we will discuss a method for solving a linear
system of n equations in n unknowns that is based on factoring its coefficient matrix into a product of lower and
upper triangular matrices. This method, called "LU-decomposition," is the basis for many computer algorithms in
common use.


Solving Linear Systems by Factoring 


Our first goal in this section is to show how to solve a linear system $A\mathbf{x} = \mathbf{b}$ of n equations in n unknowns by
factoring the coefficient matrix $A$ into a product


A=LU (1) 


where L is lower triangular and U is upper triangular. Once we understand how to do this, we will discuss how to 
obtain the factorization itself. 


Assuming that we have somehow obtained the factorization in (1), the linear system $A\mathbf{x} = \mathbf{b}$ can be solved by the
following procedure, called LU-decomposition.


The Method of LU-Decomposition 
Step 1. Rewrite the system Ax = b as


LUx = b (2)


Step 2. Define a new n × 1 matrix y by


Ux = y (3)


Step 3. Use (3) to rewrite (2) as Ly = b and solve this system for y.
Step 4. Substitute y in (3) and solve for x.


This procedure, which is illustrated in Figure 9.1.1, replaces the single linear system Ax = b by a pair of linear
systems

Ux = y
Ly = b

that must be solved in succession. However, since each of these systems has a triangular coefficient matrix, it
generally turns out to involve no more computation to solve the two systems than to solve the original system
directly.


[Figure 9.1.1: Solve Ax = b]


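EXAMPLE 1 Solving Ax = b by LU-Decomposition


Later in this section we will derive the factorization

\[
\underbrace{\begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}}_{L}
\underbrace{\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}}_{U}
\tag{4}
\]

Use this result to solve the linear system

\[
\underbrace{\begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}}_{b}
\]

Solution From (4) we can rewrite this system as

\[
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}}_{L}
\underbrace{\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}}_{U}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}}_{b}
\tag{5}
\]
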

Historical Note In 1979 an important library of machine-independent linear algebra 
programs called LINPACK was developed at Argonne National Laboratories. Many of the 
programs in that library use the decomposition methods that we will study in this section. 
Variations of the LINPACK routines are used in many computer programs, including 
MATLAB, Mathematica, and Maple. 


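As specified in Step 2 above, let us define $y_1$, $y_2$, and $y_3$ by the equation

\[
\underbrace{\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}}_{U}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x}
=
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}}_{y}
\tag{6}
\]

which allows us to rewrite (5) as

\[
\underbrace{\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}}_{L}
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix}}_{b}
\tag{7}
\]

or equivalently as

\[
\begin{aligned}
2y_1 &= 2 \\
-3y_1 + y_2 &= 2 \\
4y_1 - 3y_2 + 7y_3 &= 3
\end{aligned}
\]

This system can be solved by a procedure that is similar to back substitution, except that we solve the
equations from the top down instead of from the bottom up. This procedure, called forward
substitution, yields

\[ y_1 = 1, \quad y_2 = 5, \quad y_3 = 2 \]

(verify). As indicated in Step 4 above, we substitute these values into (6), which yields the linear
system

\[
\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}
\]

or, equivalently,

\[
\begin{aligned}
x_1 + 3x_2 + x_3 &= 1 \\
x_2 + 3x_3 &= 5 \\
x_3 &= 2
\end{aligned}
\]

Solving this system by back substitution yields

\[ x_1 = 2, \quad x_2 = -1, \quad x_3 = 2 \]

(verify).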
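
The two "(verify)" steps above can be checked mechanically. The following is a minimal sketch, not a production routine: the function names `forward_substitution`, `back_substitution`, and `solve_lu` are illustrative choices, and exact rational arithmetic (Python's `fractions` module) is assumed so that the results match the hand computation exactly.

```python
from fractions import Fraction


def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L, working from the top row down."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        # Subtract the contributions of the y-values already found.
        known = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - known) / L[i][i]
    return y


def back_substitution(U, y):
    """Solve Ux = y for upper triangular U, working from the bottom row up."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / U[i][i]
    return x


def solve_lu(L, U, b):
    """Steps 3 and 4 of the method: solve Ly = b, then Ux = y."""
    return back_substitution(U, forward_substitution(L, b))


# Data of Example 1.
L = [[2, 0, 0], [-3, 1, 0], [4, -3, 7]]
U = [[1, 3, 1], [0, 1, 3], [0, 0, 1]]
b = [2, 2, 3]

y = forward_substitution(L, b)
x = solve_lu(L, U, b)
print([str(v) for v in y])  # intermediate vector y
print([str(v) for v in x])  # solution x
```

Both functions assume nonzero diagonal entries; a zero on the diagonal would raise a `ZeroDivisionError`.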


Alan Mathison Turing (1912-1954) 


Historical Note Although the ideas were known earlier, credit for popularizing the matrix
formulation of the LU-decomposition is often given to the British mathematician Alan
Turing for his work on the subject in 1948. Turing, one of the great geniuses of the twentieth
century, is the founder of the field of artificial intelligence. Among his many
accomplishments in that field, he developed the concept of an internally programmed
computer before the practical technology had reached the point where the construction of
such a machine was possible. During World War II Turing was secretly recruited by the
British government's Code and Cypher School at Bletchley Park to help break the Nazi
Enigma codes; it was Turing's statistical approach that provided the breakthrough. In addition
to being a brilliant mathematician, Turing was a world-class runner who competed
successfully with Olympic-level competition. Sadly, Turing, a homosexual, was tried and
convicted of "gross indecency" in 1952, in violation of the then-existing British statutes.
Depressed, he committed suicide at age 41 by eating an apple laced with cyanide.

[Image: Time & Life Pictures/Getty Images, Inc.]


Finding LU-Decompositions 


Example 1 makes it clear that after A is factored into lower and upper triangular matrices, the system Ax = b can
be solved by one forward substitution and one back substitution. We will now show how to obtain such
factorizations. We begin with some terminology.


DEFINITION 1 


A factorization of a square matrix A as A = LU, where L is lower triangular and U is upper triangular, is
called an LU-decomposition (or LU-factorization) of A.


Not every square matrix has an LU-decomposition. However, we will see that if it is possible to reduce a square
matrix A to row echelon form by Gaussian elimination without performing any row interchanges, then A will have
an LU-decomposition, though it may not be unique. To see why this is so, assume that A has been reduced to a row
echelon form U using a sequence of row operations that does not include row interchanges. We know from
Theorem 1.5.1 that these operations can be accomplished by multiplying A on the left by an appropriate sequence
of elementary matrices; that is, there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that

\[ E_k \cdots E_2 E_1 A = U \tag{8} \]

Since elementary matrices are invertible, we can solve (8) for A as

\[ A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} U \]

or more briefly as

\[ A = LU \tag{9} \]

where

\[ L = E_1^{-1} E_2^{-1} \cdots E_k^{-1} \tag{10} \]

We now have all of the ingredients to prove the following result. 


THEOREM 9.1.1


If A is a square matrix that can be reduced to a row echelon form U by Gaussian elimination without row
interchanges, then A can be factored as A = LU, where L is a lower triangular matrix.


Proof Let L and U be the matrices in Formulas (10) and (8), respectively. The matrix U is upper triangular because it
is a row echelon form of a square matrix (so all entries below its main diagonal are zero). To prove that L is lower
triangular, it suffices to prove that each factor on the right side of (10) is lower triangular, since Theorem 1.7.1(b) will
then imply that L itself is lower triangular. Since row interchanges are excluded, each $E_j$ results either by adding a
scalar multiple of one row of an identity matrix to a row below or by multiplying one row of an identity matrix by a
nonzero scalar. In either case, the resulting matrix $E_j$ is lower triangular and hence so is $E_j^{-1}$ by Theorem 1.7.1(d).
This completes the proof.


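EXAMPLE 2 An LU-Decomposition


Find an LU-decomposition of

\[ A = \begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix} \]

Solution To obtain an LU-decomposition, A = LU, we will reduce A to a row echelon form U using Gaussian
elimination and then calculate L from (10). Each step below lists the row operation, the elementary matrix
corresponding to the row operation, the inverse of that elementary matrix, and the matrix that results from the
operation.

Step 1. (1/2) × row 1

\[
E_1 = \begin{bmatrix} \tfrac12 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
E_1^{-1} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\text{result: } \begin{bmatrix} 1 & 3 & 1 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix}
\]

Step 2. (3 × row 1) + row 2

\[
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
E_2^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\text{result: } \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 4 & 9 & 2 \end{bmatrix}
\]

Step 3. (−4 × row 1) + row 3

\[
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}, \quad
E_3^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix}, \quad
\text{result: } \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & -3 & -2 \end{bmatrix}
\]

Step 4. (3 × row 2) + row 3

\[
E_4 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}, \quad
E_4^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{bmatrix}, \quad
\text{result: } \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 7 \end{bmatrix}
\]

Step 5. (1/7) × row 3

\[
E_5 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \tfrac17 \end{bmatrix}, \quad
E_5^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \quad
\text{result: } \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} = U
\]

and, from (10),

\[
L =
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 4 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}
=
\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}
\]

so

\[
\begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix}
=
\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\]

is an LU-decomposition of A.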
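
As a quick check of this example, the product of the five inverse elementary matrices can be multiplied out by machine. This is only a sketch: the helper `matmul` and the variable names are illustrative, and the matrices are entered as plain nested lists of integers.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


# Inverses of the elementary matrices from Steps 1 to 5, in order.
E_inverses = [
    [[2, 0, 0], [0, 1, 0], [0, 0, 1]],   # undoes (1/2) x row 1
    [[1, 0, 0], [-3, 1, 0], [0, 0, 1]],  # undoes (3 x row 1) + row 2
    [[1, 0, 0], [0, 1, 0], [4, 0, 1]],   # undoes (-4 x row 1) + row 3
    [[1, 0, 0], [0, 1, 0], [0, -3, 1]],  # undoes (3 x row 2) + row 3
    [[1, 0, 0], [0, 1, 0], [0, 0, 7]],   # undoes (1/7) x row 3
]

# Accumulate L as the product of the inverses, left to right.
L = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
for E_inv in E_inverses:
    L = matmul(L, E_inv)

U = [[1, 3, 1], [0, 1, 3], [0, 0, 1]]
print(L)             # the lower triangular factor
print(matmul(L, U))  # should reproduce A
```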


Bookkeeping 


As Example 2 shows, most of the work in constructing an LU-decomposition is expended in calculating L. 
However, all this work can be eliminated by some careful bookkeeping of the operations used to reduce A to U. 


Because we are assuming that no row interchanges are required to reduce A to U, there are only two types of 
operations involved—multiplying a row by a nonzero constant, and adding a scalar multiple of one row to another. 
The first operation is used to introduce the leading 1's and the second to introduce zeros below the leading 1's. 


In Example 2, a multiplier of 1/2 was needed in Step 1 to introduce a leading 1 in the first row, and a multiplier of 1/7
was needed in Step 5 to introduce a leading 1 in the third row. No actual multiplier was required to introduce a
leading 1 in the second row because it was already a 1 at the end of Step 2, but for convenience let us say that the
multiplier was 1. Comparing these multipliers with the successive diagonal entries of L, we see that these diagonal
entries are precisely the reciprocals of the multipliers used to construct U:

\[ L = \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \tag{11} \]

Also observe in Example 2 that to introduce zeros below the leading 1 in the first row, we used the operations
add 3 times the first row to the second
add −4 times the first row to the third
and to introduce the zero below the leading 1 in the second row, we used the operation
add 3 times the second row to the third
Now note in (12) that in each position below the main diagonal of L, the entry is the negative of the multiplier in the
operation that introduced the zero in that position in U:

\[ L = \begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix} \tag{12} \]


This suggests the following procedure for constructing an LU-decomposition of a square matrix A, assuming that 
this matrix can be reduced to row echelon form without row interchanges. 


Procedure for Constructing an LU-Decomposition


Step 1. Reduce A to a row echelon form U by Gaussian elimination without row interchanges, keeping
track of the multipliers used to introduce the leading 1's and the multipliers used to introduce the
zeros below the leading 1's.


Step 2. In each position along the main diagonal of L, place the reciprocal of the multiplier that introduced
the leading 1 in that position in U.


Step 3. In each position below the main diagonal of L, place the negative of the multiplier used to
introduce the zero in that position in U.


Step 4. Form the decomposition A = LU.
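
The four steps above translate fairly directly into code. The sketch below is one possible rendering, not a library routine: the name `lu_unit_upper` is hypothetical, exact fractions are used to avoid roundoff, and the code makes the simplifying assumption that every pivot lies on the main diagonal and is nonzero (it raises an error otherwise, even in cases where a row echelon form could still be reached without row interchanges).

```python
from fractions import Fraction


def lu_unit_upper(A):
    """Return (L, U) with A = LU, where U has 1's on its main diagonal.

    Assumes each diagonal pivot is nonzero when it is reached.
    """
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        pivot = U[j][j]
        if pivot == 0:
            raise ValueError("zero pivot: this sketch cannot continue")
        # Step 2: the multiplier 1/pivot creates the leading 1,
        # so its reciprocal (the pivot itself) goes on the diagonal of L.
        L[j][j] = pivot
        U[j] = [v / pivot for v in U[j]]
        for i in range(j + 1, n):
            m = -U[i][j]  # "add m times row j to row i" creates the zero
            # Step 3: the negative of that multiplier goes below the diagonal.
            L[i][j] = -m
            U[i] = [U[i][k] + m * U[j][k] for k in range(n)]
    return L, U


# The matrix of Example 3 in this section.
A = [[6, -2, 0], [9, -1, 1], [3, 7, 5]]
L, U = lu_unit_upper(A)
print([[str(v) for v in row] for row in L])
print([[str(v) for v in row] for row in U])
```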


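EXAMPLE 3 Constructing an LU-Decomposition


Find an LU-decomposition of

\[ A = \begin{bmatrix} 6 & -2 & 0 \\ 9 & -1 & 1 \\ 3 & 7 & 5 \end{bmatrix} \]

Solution We will reduce A to a row echelon form U and at each step we will fill in an entry of L in
accordance with the four-step procedure above. A bullet (•) marks an entry of L that has not yet been determined.

\[
A = \begin{bmatrix} 6 & -2 & 0 \\ 9 & -1 & 1 \\ 3 & 7 & 5 \end{bmatrix}
\qquad
L = \begin{bmatrix} \bullet & 0 & 0 \\ \bullet & \bullet & 0 \\ \bullet & \bullet & \bullet \end{bmatrix}
\]

Row 1, multiplier = 1/6:

\[
\begin{bmatrix} 1 & -\tfrac13 & 0 \\ 9 & -1 & 1 \\ 3 & 7 & 5 \end{bmatrix}
\qquad
L = \begin{bmatrix} 6 & 0 & 0 \\ \bullet & \bullet & 0 \\ \bullet & \bullet & \bullet \end{bmatrix}
\]

Row 2, multiplier = −9; row 3, multiplier = −3:

\[
\begin{bmatrix} 1 & -\tfrac13 & 0 \\ 0 & 2 & 1 \\ 0 & 8 & 5 \end{bmatrix}
\qquad
L = \begin{bmatrix} 6 & 0 & 0 \\ 9 & \bullet & 0 \\ 3 & \bullet & \bullet \end{bmatrix}
\]

Row 2, multiplier = 1/2:

\[
\begin{bmatrix} 1 & -\tfrac13 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 8 & 5 \end{bmatrix}
\qquad
L = \begin{bmatrix} 6 & 0 & 0 \\ 9 & 2 & 0 \\ 3 & \bullet & \bullet \end{bmatrix}
\]

Row 3, multiplier = −8:

\[
\begin{bmatrix} 1 & -\tfrac13 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
L = \begin{bmatrix} 6 & 0 & 0 \\ 9 & 2 & 0 \\ 3 & 8 & \bullet \end{bmatrix}
\]

Row 3, multiplier = 1 (no actual operation is performed here since there is already a leading 1 in the third row):

\[
U = \begin{bmatrix} 1 & -\tfrac13 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
L = \begin{bmatrix} 6 & 0 & 0 \\ 9 & 2 & 0 \\ 3 & 8 & 1 \end{bmatrix}
\]

Thus, we have constructed the LU-decomposition

\[
A = LU =
\begin{bmatrix} 6 & 0 & 0 \\ 9 & 2 & 0 \\ 3 & 8 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -\tfrac13 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & 1 \end{bmatrix}
\]

We leave it for you to confirm this end result by multiplying the factors.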


LU-Decompositions Are Not Unique 


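In the absence of restrictions, LU-decompositions are not unique. For example, if

\[
A = LU =
\begin{bmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{bmatrix}
\begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix}
\]

and L has nonzero diagonal entries, then we can shift the diagonal entries from the left factor to the right factor by
writing

\[
\begin{aligned}
A &=
\begin{bmatrix} 1 & 0 & 0 \\ l_{21}/l_{11} & 1 & 0 \\ l_{31}/l_{11} & l_{32}/l_{22} & 1 \end{bmatrix}
\begin{bmatrix} l_{11} & 0 & 0 \\ 0 & l_{22} & 0 \\ 0 & 0 & l_{33} \end{bmatrix}
\begin{bmatrix} 1 & u_{12} & u_{13} \\ 0 & 1 & u_{23} \\ 0 & 0 & 1 \end{bmatrix} \\
&=
\begin{bmatrix} 1 & 0 & 0 \\ l_{21}/l_{11} & 1 & 0 \\ l_{31}/l_{11} & l_{32}/l_{22} & 1 \end{bmatrix}
\begin{bmatrix} l_{11} & l_{11}u_{12} & l_{11}u_{13} \\ 0 & l_{22} & l_{22}u_{23} \\ 0 & 0 & l_{33} \end{bmatrix}
\end{aligned}
\]

which is another LU-decomposition of A.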


LDU-Decompositions 


The method we have described for computing LU-decompositions may result in an "asymmetry" in that the matrix
U has 1's on the main diagonal but L need not. However, if it is preferred to have 1's on the main diagonal of the
lower triangular factor, then we can "shift" the diagonal entries of L to a diagonal matrix D and write L as

\[ L = L'D \]

where $L'$ is a lower triangular matrix with 1's on the main diagonal. For example, a general 3 × 3 lower triangular
matrix with nonzero entries on the main diagonal can be factored as

\[
\underbrace{\begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix}}_{L}
=
\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ a_{21}/a_{11} & 1 & 0 \\ a_{31}/a_{11} & a_{32}/a_{22} & 1 \end{bmatrix}}_{L'}
\underbrace{\begin{bmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix}}_{D}
\]

Note that the columns of $L'$ are obtained by dividing each entry in the corresponding column of L by the diagonal
entry in the column. Thus, for example, we can rewrite (4) as

\[
\begin{aligned}
\begin{bmatrix} 2 & 6 & 2 \\ -3 & -8 & 0 \\ 4 & 9 & 2 \end{bmatrix}
&=
\begin{bmatrix} 2 & 0 & 0 \\ -3 & 1 & 0 \\ 4 & -3 & 7 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \\
&=
\begin{bmatrix} 1 & 0 & 0 \\ -\tfrac32 & 1 & 0 \\ 2 & -3 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}
\end{aligned}
\]

One can prove that if A is a square matrix that can be reduced to row echelon form without row interchanges, then
A can be factored uniquely as

\[ A = LDU \]

where L is a lower triangular matrix with 1's on the main diagonal, D is a diagonal matrix, and U is an upper
triangular matrix with 1's on the main diagonal. This is called the LDU-decomposition (or LDU-factorization) of
A.
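
The "shift" of the diagonal from L into D is a column-by-column division, which the following sketch illustrates. The function name `split_diagonal` is an illustrative choice, and the code assumes that every diagonal entry of L is nonzero.

```python
from fractions import Fraction


def split_diagonal(L):
    """Factor a lower triangular L with nonzero diagonal as L = L' D.

    L' has 1's on its main diagonal and D is diagonal.
    """
    n = len(L)
    D = [[Fraction(L[i][i]) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    # Each column j of L is divided by its diagonal entry L[j][j].
    L_unit = [[Fraction(L[i][j]) / L[j][j] for j in range(n)]
              for i in range(n)]
    return L_unit, D


# The lower triangular factor from Formula (4).
L = [[2, 0, 0], [-3, 1, 0], [4, -3, 7]]
L_unit, D = split_diagonal(L)
print([[str(v) for v in row] for row in L_unit])
print([[str(v) for v in row] for row in D])
```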


PLU-Decompositions 


Many computer algorithms for solving linear systems perform row interchanges to reduce roundoff error, in which
case the existence of an LU-decomposition is not guaranteed. However, it is possible to work around this problem
by "preprocessing" the coefficient matrix A so that the row interchanges are performed prior to computing the
LU-decomposition itself. More specifically, the idea is to create a matrix Q (called a permutation matrix) by
multiplying, in sequence, those elementary matrices that produce the row interchanges and then execute them by
computing the product QA. This product can then be reduced to row echelon form without row interchanges, so it is
assured to have an LU-decomposition


QA = LU (13)


Because the matrix Q is invertible (being a product of elementary matrices), the systems Ax = b and QAx = Qb
will have the same solutions. But it follows from (13) that the latter system can be rewritten as LUx = Qb and hence
can be solved using LU-decomposition.


It is common to see Equation (13) expressed as
A = PLU (14)


in which $P = Q^{-1}$. This is called a PLU-decomposition (or PLU-factorization) of A.
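
A sketch of how a PLU-decomposition might be computed is given below. Several choices here are assumptions rather than anything prescribed in this section: the function name `plu` is hypothetical, the row interchanges are chosen by partial pivoting (largest absolute value in the column) and are interleaved with the elimination instead of being performed first, L is normalized to have 1's on its diagonal (so U generally does not), and A is assumed to be invertible.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def plu(A):
    """Return (P, L, U) with A = P L U for an invertible square matrix A."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    perm = list(range(n))  # perm[i] = row of A currently sitting in position i
    for j in range(n):
        # Choose the row (at or below row j) with the largest entry in column j.
        p = max(range(j, n), key=lambda r: abs(U[r][j]))
        if U[p][j] == 0:
            raise ValueError("matrix is singular")
        U[j], U[p] = U[p], U[j]
        L[j], L[p] = L[p], L[j]  # carry along multipliers already recorded
        perm[j], perm[p] = perm[p], perm[j]
        for i in range(j + 1, n):
            L[i][j] = U[i][j] / U[j][j]
            U[i] = [U[i][k] - L[i][j] * U[j][k] for k in range(n)]
    for i in range(n):
        L[i][i] = Fraction(1)
    # Q has a 1 in position (i, perm[i]), so QA = LU; P is its inverse (transpose).
    P = [[Fraction(1) if perm[j] == i else Fraction(0) for j in range(n)]
         for i in range(n)]
    return P, L, U


A = [[0, 1, 4], [1, 2, 2], [3, 1, 3]]  # has a zero in the first pivot position
P, L, U = plu(A)
print(matmul(P, matmul(L, U)) == A)
```

Because the factors depend on which interchanges are chosen, the output of this sketch need not agree entry by entry with a PLU-decomposition found by hand.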


Concept Review 
* LU-decomposition 
* LDU-decomposition 
* PLU-decomposition 


Skills 

* Determine whether a square matrix has an LU-decomposition. 
* Find an LU-decomposition of a square matrix. 

* Use the method of LU-decomposition to solve linear systems. 
* Find the LDU-decomposition of a square matrix. 


* Find a PLU-decomposition of a square matrix. 


Exercise Set 9.1 


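1. Use the method of Example 1 and the LU-decomposition

\[
\begin{bmatrix} 3 & -6 \\ -2 & 5 \end{bmatrix}
=
\begin{bmatrix} 3 & 0 \\ -2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
\]

to solve the system

\[
\begin{aligned}
3x_1 - 6x_2 &= 0 \\
-2x_1 + 5x_2 &= 1
\end{aligned}
\]

Answer:

\[ x_1 = 2, \quad x_2 = 1 \]

2. Use the method of Example 1 and the LU-decomposition

\[
\begin{bmatrix} 3 & -6 & -3 \\ 2 & 0 & 6 \\ -4 & 7 & 4 \end{bmatrix}
=
\begin{bmatrix} 3 & 0 & 0 \\ 2 & 4 & 0 \\ -4 & -1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & -2 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}
\]

to solve the system

\[
\begin{aligned}
3x_1 - 6x_2 - 3x_3 &= -3 \\
2x_1 + 6x_3 &= -22 \\
-4x_1 + 7x_2 + 4x_3 &= 3
\end{aligned}
\]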


In Exercises 3-10, find an LU-decomposition of the coefficient matrix, and then use the method of Example 1 to
solve the system.


[а Ads] [2 


Answer: 


— 5 2]||43 6 
Answer: 
ху= —1, хэ= 1, x3=0 
6.|-5 12 -6||xXi —33 
1-2 2 || х2 |= 7 
0 1 1 || *3 =1 
7, 5 5 10]/*1 0 
=8 —? —-8||x2|]—|1 
0 4 26]||*3 
Answer 
xy= —1, x2= 1, x3=0 
8.|-1 = —4 || x1 —6 
3 10 —10||52|=|—3 
—2 —4 11 || ^3 9 
9.| —1 0 ЕЗ 
2 3 -26|72| |-1 
0 —1 2 oj *3] | 
0 0 1 5||*4 7 


Answer: 


Хр= —3, x2= 1, x3—2, x4—1 
10.12 —-4 0 Of] %1 8 

1 2 12 0]|43 0 

0 —1 —4 -5||43 1 

0 0 2 11 |[ ^4 0 


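11. Let

\[ A = \begin{bmatrix} 2 & 1 & -1 \\ -2 & -1 & 2 \\ 2 & 1 & 0 \end{bmatrix} \]

(a) Find an LU-decomposition of A.

(b) Express A in the form $A = L_1 D U_1$, where $L_1$ is lower triangular with 1's along the main diagonal, $U_1$ is
upper triangular, and D is a diagonal matrix.

(c) Express A in the form $A = L_2 U_2$, where $L_2$ is lower triangular with 1's along the main diagonal and $U_2$ is
upper triangular.

Answer:

(a)
\[
A = LU =
\begin{bmatrix} 2 & 0 & 0 \\ -2 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \tfrac12 & -\tfrac12 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]

(b)
\[
A = L_1 D U_1 =
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \tfrac12 & -\tfrac12 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]

(c)
\[
A = L_2 U_2 =
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 1 & -1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\]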


12 2 2 
А= 
ni 
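13.
\[ A = \begin{bmatrix} 3 & -12 & 6 \\ 0 & 2 & 0 \\ 6 & -28 & 13 \end{bmatrix} \]

Answer:

\[
A =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -4 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]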

(a) Show that the matrix 


has no LU-decomposition. 


(b) Find a PLU-decomposition of this matrix. 


In Exercises 15-16, use the given PLU-decomposition of A to solve the linear system Ax = b by rewriting it as
$P^{-1}Ax = P^{-1}b$ and solving this system by LU-decomposition.


15. 2 014 
B o | Tp AS] 1 2 2| 
5 3 143 
010110 0112 2 
А = |10 01101 0110 14 |—PZU 
0 0 1||3 —5 1110 0 17 
Answer: 
zl ru C 20 de 
TETUER ST 
16. 3 4 |2 
b = |0}; A=/0 2 1k 
6 8 18 
1001 00114 12 
A= |00 1112 1010 =1 4| —- PLU 
0o 1 ollo UOI 0 9 


In Exercises 17-18, find a PLU-decomposition of A, and use it to solve the linear system Ax = b by the method
of Exercises 15 and 16.


17. 3 = A 
А=|3 -1 1; b=] 1 
0 
Answer 
1 
1 о 0113 о 011. 73 0 , 
A=|0 0 ijjo 2 olf, | if а= -1. m= 5. 33 
010130 1 2 
0 0 1 
18. 03 —2 7 
А=|11 4[|[Ъ=| 5 
297 “5 = 2 
19. Let 


(a) Prove: If a ≠ 0, then the matrix A has a unique LU-decomposition with 1's along the main diagonal of L.
(b) Find the LU-decomposition described in part (a). 


Answer: 
(b) a b 1 буа b 
a a 


20. Let Ax = b be a linear system of n equations in n unknowns, and assume that A is an invertible matrix that can
be reduced to row echelon form without row interchanges. How many additions and multiplications are
required to solve the system by the method of Example 1?


21. Prove: If A is any n × n matrix, then A can be factored as A = PLU, where L is lower triangular, U is upper
triangular, and P can be obtained by interchanging the rows of $I_n$ appropriately. [Hint: Let U be a row echelon
form of A, and let all row interchanges required in the reduction of A to U be performed first.]


True-False Exercises
In parts (a)-(e) determine whether the statement is true or false, and justify your answer.

(a) Every square matrix has an LU-decomposition.
Answer: False

(b) If a square matrix A is row equivalent to an upper triangular matrix U, then A has an LU-decomposition.
Answer: False

(c) If $L_1, L_2, \ldots, L_k$ are n × n lower triangular matrices, then the product $L_1 L_2 \cdots L_k$ is lower triangular.
Answer: True

(d) If a square matrix A has an LU-decomposition, then A has a unique LDU-decomposition.
Answer: True

(e) Every square matrix has a PLU-decomposition.
Answer: True
9.2 The Power Method


The eigenvalues of a square matrix can, in theory, be found by solving the characteristic equation. However, this 
procedure has so many computational difficulties that it is almost never used in applications. In this section we will 
discuss an algorithm that can be used to approximate the eigenvalue with greatest absolute value and a corresponding 
eigenvector. This particular eigenvalue and its corresponding eigenvectors are important because they arise naturally in 
many iterative processes. The methods we will study in this section have recently been used to create Internet search 
engines such as Google. We will discuss this application in the next section. 


The Power Method 


There are many applications in which some vector $x_0$ in $R^n$ is multiplied repeatedly by an n × n matrix A to produce a
sequence

\[ x_0, \; Ax_0, \; A^2x_0, \; \ldots, \; A^kx_0, \; \ldots \]

We call a sequence of this form a power sequence generated by A. In this section we will be concerned with the
convergence of power sequences and how such sequences can be used to approximate eigenvalues and eigenvectors.
For this purpose, we make the following definition.


DEFINITION 1 


If the distinct eigenvalues of a matrix A are $\lambda_1, \lambda_2, \ldots, \lambda_k$, and if $|\lambda_1|$ is larger than $|\lambda_2|, \ldots, |\lambda_k|$, then $\lambda_1$ is
called a dominant eigenvalue of A. Any eigenvector corresponding to a dominant eigenvalue is called a
dominant eigenvector of A.


EXAMPLE 1 Dominant Eigenvalues


Some matrices have dominant eigenvalues and some do not. For example, if the distinct eigenvalues of a
matrix are

\[ \lambda_1 = -4, \quad \lambda_2 = -2, \quad \lambda_3 = 1, \quad \lambda_4 = 3 \]

then $\lambda_1 = -4$ is dominant since $|\lambda_1| = 4$ is greater than the absolute values of all the other eigenvalues;
but if the distinct eigenvalues of a matrix are

\[ \lambda_1 = 7, \quad \lambda_2 = -7, \quad \lambda_3 = -2, \quad \lambda_4 = 5 \]

then $|\lambda_1| = |\lambda_2| = 7$, so there is no eigenvalue whose absolute value is greater than the absolute value of
all the other eigenvalues.
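
The test in Definition 1 is easy to automate once the distinct eigenvalues are known. The following sketch uses a hypothetical helper, `dominant_eigenvalue`, and assumes a nonempty list of distinct eigenvalues; it returns `None` when no eigenvalue is dominant.

```python
def dominant_eigenvalue(eigenvalues):
    """Return the dominant eigenvalue of a list of distinct eigenvalues, or None."""
    ranked = sorted(eigenvalues, key=abs, reverse=True)
    # Dominance requires a strictly largest absolute value.
    if len(ranked) == 1 or abs(ranked[0]) > abs(ranked[1]):
        return ranked[0]
    return None


print(dominant_eigenvalue([-4, -2, 1, 3]))  # first list in Example 1
print(dominant_eigenvalue([7, -7, -2, 5]))  # second list in Example 1
```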


The most important theorems about convergence of power sequences apply to n × n matrices with n linearly
independent eigenvectors (symmetric matrices, for example), so we will limit our discussion to this case in this section.


THEOREM 9.2.1 


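Let A be a symmetric n × n matrix with a positive dominant eigenvalue $\lambda$. If $x_0$ is a unit vector in $R^n$ that is
not orthogonal to the eigenspace corresponding to $\lambda$, then the normalized power sequence

\[
x_0, \quad x_1 = \frac{Ax_0}{\|Ax_0\|}, \quad x_2 = \frac{Ax_1}{\|Ax_1\|}, \quad \ldots, \quad x_k = \frac{Ax_{k-1}}{\|Ax_{k-1}\|}, \quad \ldots \tag{1}
\]

converges to a unit dominant eigenvector, and the sequence

\[
Ax_1 \cdot x_1, \quad Ax_2 \cdot x_2, \quad Ax_3 \cdot x_3, \quad \ldots, \quad Ax_k \cdot x_k, \quad \ldots \tag{2}
\]

converges to the dominant eigenvalue $\lambda$.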


Remark In the exercises we will ask you to show that (1) can also be expressed as

\[
x_0, \quad x_1 = \frac{Ax_0}{\|Ax_0\|}, \quad x_2 = \frac{A^2x_0}{\|A^2x_0\|}, \quad \ldots, \quad x_k = \frac{A^kx_0}{\|A^kx_0\|}, \quad \ldots \tag{3}
\]

This form of the power sequence expresses each iterate in terms of the starting vector $x_0$, rather than in terms of its
predecessor.


We will not prove Theorem 9.2.1, but we can make it plausible geometrically in the 2 × 2 case where A is a symmetric
matrix with distinct positive eigenvalues, $\lambda_1$ and $\lambda_2$, one of which is dominant. To be specific, assume that $\lambda_1$ is
dominant and

\[ \lambda_1 > \lambda_2 > 0 \]

Since we are assuming that A is symmetric and has distinct eigenvalues, it follows from Theorem 7.2.2 that the
eigenspaces corresponding to $\lambda_1$ and $\lambda_2$ are perpendicular lines through the origin. Thus, the assumption that $x_0$ is a
unit vector that is not orthogonal to the eigenspace corresponding to $\lambda_1$ implies that $x_0$ does not lie in the eigenspace
corresponding to $\lambda_2$. To see the geometric effect of multiplying $x_0$ by A, it will be useful to split $x_0$ into the sum

\[ x_0 = v_0 + w_0 \tag{4} \]

where $v_0$ and $w_0$ are the orthogonal projections of $x_0$ on the eigenspaces of $\lambda_1$ and $\lambda_2$, respectively (Figure 9.2.1a).

[Figure 9.2.1: panels (a), (b), (c) showing the eigenspaces of $\lambda_1$ and $\lambda_2$, the vector $x_0$, and $\lambda_1 v_0 + \lambda_2 w_0$]

This enables us to express $Ax_0$ as

\[ Ax_0 = Av_0 + Aw_0 = \lambda_1 v_0 + \lambda_2 w_0 \tag{5} \]

which tells us that multiplying $x_0$ by A "scales" the terms $v_0$ and $w_0$ in (4) by $\lambda_1$ and $\lambda_2$, respectively. However, $\lambda_1$ is
larger than $\lambda_2$, so the scaling is greater in the direction of $v_0$ than in the direction of $w_0$. Thus, multiplying $x_0$ by A
"pulls" $x_0$ toward the eigenspace of $\lambda_1$, and normalizing produces a vector $x_1 = Ax_0 / \|Ax_0\|$, which is on the unit
circle and is closer to the eigenspace of $\lambda_1$ than $x_0$ (Figure 9.2.1b). Similarly, multiplying $x_1$ by A and normalizing
produces a unit vector $x_2$ that is closer to the eigenspace of $\lambda_1$ than $x_1$. Thus, it seems reasonable that by repeatedly
multiplying by A and normalizing we will produce a sequence of vectors $x_k$ that lie on the unit circle and converge to a
unit vector x in the eigenspace of $\lambda_1$ (Figure 9.2.1c). Moreover, if $x_k$ converges to x, then it also seems reasonable that
$Ax_k \cdot x_k$ will converge to

\[ Ax \cdot x = \lambda_1 x \cdot x = \lambda_1 \|x\|^2 = \lambda_1 \]

which is the dominant eigenvalue of A.


The Power Method with Euclidean Scaling 


Theorem 9.2.1 provides us with an algorithm for approximating the dominant eigenvalue and a corresponding unit 
eigenvector of a symmetric matrix А, provided the dominant eigenvalue is positive. This algorithm, called the power 
method with Euclidean scaling, is as follows: 


The Power Method with Euclidean Scaling 


Step 1. Choose an arbitrary nonzero vector and normalize it, if need be, to obtain a unit vector $x_0$.


Step 2. Compute $Ax_0$ and normalize it to obtain the first approximation $x_1$ to a dominant unit eigenvector.
Compute $Ax_1 \cdot x_1$ to obtain the first approximation to the dominant eigenvalue.


Step 3. Compute $Ax_1$ and normalize it to obtain the second approximation $x_2$ to a dominant unit eigenvector.
Compute $Ax_2 \cdot x_2$ to obtain the second approximation to the dominant eigenvalue.


Step 4. Compute $Ax_2$ and normalize it to obtain the third approximation $x_3$ to a dominant unit eigenvector.
Compute $Ax_3 \cdot x_3$ to obtain the third approximation to the dominant eigenvalue.


Continuing in this way will usually generate a sequence of better and better approximations to the dominant
eigenvalue and a corresponding unit eigenvector.
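
The steps above can be sketched in a few lines of code. This is an illustration under stated assumptions, not a robust implementation: the names `matvec`, `dot`, and `power_method_euclidean` are hypothetical, matrices are plain lists of rows, ordinary floating-point arithmetic is used, and the number of iterations is fixed in advance. The sample data are the 2 × 2 symmetric matrix with rows (3, 2) and (2, 3) and the starting vector (1, 0).

```python
import math


def matvec(A, x):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def power_method_euclidean(A, x0, iterations):
    """Return a list of (x_k, Ax_k . x_k) for k = 1, ..., iterations."""
    norm0 = math.sqrt(dot(x0, x0))
    x = [xi / norm0 for xi in x0]  # Step 1: start from a unit vector
    history = []
    for _ in range(iterations):
        y = matvec(A, x)
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]     # next unit-vector approximation
        estimate = dot(matvec(A, x), x)  # next eigenvalue approximation
        history.append((x, estimate))
    return history


A = [[3, 2], [2, 3]]
history = power_method_euclidean(A, [1, 0], 5)
for k, (x, estimate) in enumerate(history, start=1):
    print(k, [round(v, 5) for v in x], round(estimate, 5))
```

A practical version would replace the fixed iteration count with a stopping rule based on how much successive estimates change.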


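EXAMPLE 2 The Power Method with Euclidean Scaling


Apply the power method with Euclidean scaling to

\[ A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \quad \text{with} \quad x_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

Stop at $x_5$ and compare the resulting approximations to the exact values of the dominant eigenvalue and
eigenvector.


Solution We will leave it for you to show that the eigenvalues of A are $\lambda = 1$ and $\lambda = 5$ and that the
eigenspace corresponding to the dominant eigenvalue $\lambda = 5$ is the line represented by the parametric
equations $x_1 = t$, $x_2 = t$, which we can write in vector form as

\[ x = t \begin{bmatrix} 1 \\ 1 \end{bmatrix} \tag{6} \]

Setting $t = 1/\sqrt{2}$ yields the normalized dominant eigenvector

\[
v_1 = \begin{bmatrix} \tfrac{1}{\sqrt2} \\ \tfrac{1}{\sqrt2} \end{bmatrix}
\approx \begin{bmatrix} 0.707106781187\ldots \\ 0.707106781187\ldots \end{bmatrix} \tag{7}
\]

Now let us see what happens when we use the power method, starting with the unit vector $x_0$.

\[
\begin{aligned}
Ax_0 &= \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix},
& x_1 &= \frac{Ax_0}{\|Ax_0\|} \approx \frac{1}{3.60555}\begin{bmatrix} 3 \\ 2 \end{bmatrix} \approx \begin{bmatrix} 0.83205 \\ 0.55470 \end{bmatrix} \\
Ax_1 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 0.83205 \\ 0.55470 \end{bmatrix} \approx \begin{bmatrix} 3.60555 \\ 3.32820 \end{bmatrix},
& x_2 &= \frac{Ax_1}{\|Ax_1\|} \approx \frac{1}{4.90682}\begin{bmatrix} 3.60555 \\ 3.32820 \end{bmatrix} \approx \begin{bmatrix} 0.73480 \\ 0.67828 \end{bmatrix} \\
Ax_2 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 0.73480 \\ 0.67828 \end{bmatrix} \approx \begin{bmatrix} 3.56097 \\ 3.50445 \end{bmatrix},
& x_3 &= \frac{Ax_2}{\|Ax_2\|} \approx \frac{1}{4.99616}\begin{bmatrix} 3.56097 \\ 3.50445 \end{bmatrix} \approx \begin{bmatrix} 0.71274 \\ 0.70143 \end{bmatrix} \\
Ax_3 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 0.71274 \\ 0.70143 \end{bmatrix} \approx \begin{bmatrix} 3.54108 \\ 3.52976 \end{bmatrix},
& x_4 &= \frac{Ax_3}{\|Ax_3\|} \approx \frac{1}{4.99985}\begin{bmatrix} 3.54108 \\ 3.52976 \end{bmatrix} \approx \begin{bmatrix} 0.70824 \\ 0.70597 \end{bmatrix} \\
Ax_4 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 0.70824 \\ 0.70597 \end{bmatrix} \approx \begin{bmatrix} 3.53666 \\ 3.53440 \end{bmatrix},
& x_5 &= \frac{Ax_4}{\|Ax_4\|} \approx \frac{1}{4.99999}\begin{bmatrix} 3.53666 \\ 3.53440 \end{bmatrix} \approx \begin{bmatrix} 0.70733 \\ 0.70688 \end{bmatrix}
\end{aligned}
\]

\[
\begin{aligned}
\lambda^{(1)} &= (Ax_1) \cdot x_1 = (Ax_1)^T x_1 \approx \begin{bmatrix} 3.60555 & 3.32820 \end{bmatrix}\begin{bmatrix} 0.83205 \\ 0.55470 \end{bmatrix} \approx 4.84615 \\
\lambda^{(2)} &= (Ax_2) \cdot x_2 = (Ax_2)^T x_2 \approx \begin{bmatrix} 3.56097 & 3.50445 \end{bmatrix}\begin{bmatrix} 0.73480 \\ 0.67828 \end{bmatrix} \approx 4.99361 \\
\lambda^{(3)} &= (Ax_3) \cdot x_3 = (Ax_3)^T x_3 \approx \begin{bmatrix} 3.54108 & 3.52976 \end{bmatrix}\begin{bmatrix} 0.71274 \\ 0.70143 \end{bmatrix} \approx 4.99974 \\
\lambda^{(4)} &= (Ax_4) \cdot x_4 = (Ax_4)^T x_4 \approx \begin{bmatrix} 3.53666 & 3.53440 \end{bmatrix}\begin{bmatrix} 0.70824 \\ 0.70597 \end{bmatrix} \approx 4.99999 \\
\lambda^{(5)} &= (Ax_5) \cdot x_5 = (Ax_5)^T x_5 \approx \begin{bmatrix} 3.53576 & 3.53531 \end{bmatrix}\begin{bmatrix} 0.70733 \\ 0.70688 \end{bmatrix} \approx 5.00000
\end{aligned}
\]

Thus, $\lambda^{(5)}$ approximates the dominant eigenvalue to five decimal place accuracy and $x_5$ approximates the
dominant eigenvector in (7) correctly to three decimal place accuracy.


It is accidental that $\lambda^{(5)}$ (the fifth approximation)
produced five decimal place accuracy. In general, n
iterations need not produce n decimal place
accuracy.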


The Power Method with Maximum Entry Scaling 


There is a variation of the power method in which the iterates, rather than being normalized at each stage, are scaled to 
make the maximum entry 1. To describe this method, it will be convenient to denote the maximum absolute value of the 
entries in a vector x by max(x). Thus, for example, if 


then max(x) = 7. We will need the following variation of Theorem 9.2.1.


THEOREM 9.2.2 


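Let A be a symmetric n × n matrix with a positive dominant eigenvalue $\lambda$. If $x_0$ is a nonzero vector in $R^n$ that
is not orthogonal to the eigenspace corresponding to $\lambda$, then the sequence

\[
x_0, \quad x_1 = \frac{Ax_0}{\max(Ax_0)}, \quad x_2 = \frac{Ax_1}{\max(Ax_1)}, \quad \ldots, \quad x_k = \frac{Ax_{k-1}}{\max(Ax_{k-1})}, \quad \ldots \tag{8}
\]

converges to an eigenvector corresponding to $\lambda$, and the sequence

\[
\frac{Ax_1 \cdot x_1}{x_1 \cdot x_1}, \quad \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2}, \quad \frac{Ax_3 \cdot x_3}{x_3 \cdot x_3}, \quad \ldots, \quad \frac{Ax_k \cdot x_k}{x_k \cdot x_k}, \quad \ldots \tag{9}
\]

converges to $\lambda$.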


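Remark In the exercises we will ask you to show that (8) can be written in the alternative form

\[
x_0, \quad x_1 = \frac{Ax_0}{\max(Ax_0)}, \quad x_2 = \frac{A^2x_0}{\max(A^2x_0)}, \quad \ldots, \quad x_k = \frac{A^kx_0}{\max(A^kx_0)}, \quad \ldots \tag{10}
\]

which expresses the iterates in terms of the initial vector $x_0$.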


We will omit the proof of this theorem, but if we accept that (8) converges to an eigenvector of A, then it is not hard to see
why (9) converges to the dominant eigenvalue. For this purpose we note that each term in (9) is of the form

\[ \frac{Ax \cdot x}{x \cdot x} \tag{11} \]

which is called a Rayleigh quotient of A. In the case where $\lambda$ is an eigenvalue of A and x is a corresponding eigenvector,
the Rayleigh quotient is

\[ \frac{Ax \cdot x}{x \cdot x} = \frac{\lambda x \cdot x}{x \cdot x} = \frac{\lambda (x \cdot x)}{x \cdot x} = \lambda \]

Thus, if $x_k$ converges to a dominant eigenvector x, then it seems reasonable that

\[ \frac{Ax_k \cdot x_k}{x_k \cdot x_k} \quad \text{converges to} \quad \frac{Ax \cdot x}{x \cdot x} = \lambda \]

which is the dominant eigenvalue.


Theorem 9.2.2 produces the following algorithm, called the power method with maximum entry scaling. 


The Power Method with Maximum Entry Scaling 


Step 1. Choose an arbitrary nonzero vector $x_0$.

Step 2. Compute $Ax_0$ and multiply it by the factor $1/\max(Ax_0)$ to obtain the first approximation $x_1$ to a
dominant eigenvector. Compute the Rayleigh quotient of $x_1$ to obtain the first approximation to the
dominant eigenvalue.

Step 3. Compute $Ax_1$ and scale it by the factor $1/\max(Ax_1)$ to obtain the second approximation $x_2$ to a
dominant eigenvector. Compute the Rayleigh quotient of $x_2$ to obtain the second approximation to the
dominant eigenvalue.

Step 4. Compute $Ax_2$ and scale it by the factor $1/\max(Ax_2)$ to obtain the third approximation $x_3$ to a
dominant eigenvector. Compute the Rayleigh quotient of $x_3$ to obtain the third approximation to the
dominant eigenvalue.

Continuing in this way will generate a sequence of better and better approximations to the dominant
eigenvalue and a corresponding eigenvector.
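
A minimal sketch of this variant follows. As before, the helper names (`matvec`, `dot`, `power_method_max_entry`) are illustrative assumptions, floating-point arithmetic is used, and the iteration count is fixed; the sample data are the matrix with rows (3, 2) and (2, 3) and the starting vector (1, 0).

```python
def matvec(A, x):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def power_method_max_entry(A, x0, iterations):
    """Return a list of (x_k, Rayleigh quotient of x_k) for k = 1, ..., iterations."""
    x = list(x0)
    history = []
    for _ in range(iterations):
        y = matvec(A, x)
        scale = max(abs(v) for v in y)  # max(Ax): largest absolute entry
        x = [v / scale for v in y]
        rayleigh = dot(matvec(A, x), x) / dot(x, x)
        history.append((x, rayleigh))
    return history


A = [[3, 2], [2, 3]]
history = power_method_max_entry(A, [1, 0], 5)
for k, (x, rayleigh) in enumerate(history, start=1):
    print(k, [round(v, 5) for v in x], round(rayleigh, 5))
```

Compared with Euclidean scaling, this version avoids square roots at the cost of one extra dot product per step for the denominator of the Rayleigh quotient.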


John William Strutt Rayleigh (1842-1919) 


Historical Note The British mathematical physicist John Rayleigh won the Nobel prize in physics in 1904 for
his discovery of the inert gas argon. Rayleigh also made fundamental discoveries in acoustics and optics, and
his work in wave phenomena enabled him to give the first accurate explanation of why the sky is blue.

[Image: The Granger Collection, New York]


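EXAMPLE 3 Example 2 Revisited Using Maximum Entry Scaling

Apply the power method with maximum entry scaling to

\[ A = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} \quad \text{with} \quad x_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

Stop at $x_5$ and compare the resulting approximations to the exact values and to the approximations
obtained in Example 2.


Solution We leave it for you to confirm that

\[
\begin{aligned}
Ax_0 &= \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix},
& x_1 &= \frac{Ax_0}{\max(Ax_0)} = \frac{1}{3}\begin{bmatrix} 3 \\ 2 \end{bmatrix} \approx \begin{bmatrix} 1.00000 \\ 0.66667 \end{bmatrix} \\
Ax_1 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1.00000 \\ 0.66667 \end{bmatrix} \approx \begin{bmatrix} 4.33333 \\ 4.00000 \end{bmatrix},
& x_2 &= \frac{Ax_1}{\max(Ax_1)} \approx \frac{1}{4.33333}\begin{bmatrix} 4.33333 \\ 4.00000 \end{bmatrix} \approx \begin{bmatrix} 1.00000 \\ 0.92308 \end{bmatrix} \\
Ax_2 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1.00000 \\ 0.92308 \end{bmatrix} \approx \begin{bmatrix} 4.84615 \\ 4.76923 \end{bmatrix},
& x_3 &= \frac{Ax_2}{\max(Ax_2)} \approx \frac{1}{4.84615}\begin{bmatrix} 4.84615 \\ 4.76923 \end{bmatrix} \approx \begin{bmatrix} 1.00000 \\ 0.98413 \end{bmatrix} \\
Ax_3 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1.00000 \\ 0.98413 \end{bmatrix} \approx \begin{bmatrix} 4.96825 \\ 4.95238 \end{bmatrix},
& x_4 &= \frac{Ax_3}{\max(Ax_3)} \approx \frac{1}{4.96825}\begin{bmatrix} 4.96825 \\ 4.95238 \end{bmatrix} \approx \begin{bmatrix} 1.00000 \\ 0.99681 \end{bmatrix} \\
Ax_4 &\approx \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1.00000 \\ 0.99681 \end{bmatrix} \approx \begin{bmatrix} 4.99361 \\ 4.99042 \end{bmatrix},
& x_5 &= \frac{Ax_4}{\max(Ax_4)} \approx \frac{1}{4.99361}\begin{bmatrix} 4.99361 \\ 4.99042 \end{bmatrix} \approx \begin{bmatrix} 1.00000 \\ 0.99936 \end{bmatrix}
\end{aligned}
\]

\[
\begin{aligned}
\lambda^{(1)} &= \frac{Ax_1 \cdot x_1}{x_1 \cdot x_1} = \frac{(Ax_1)^T x_1}{x_1^T x_1} \approx \frac{7.00000}{1.44444} \approx 4.84615 \\
\lambda^{(2)} &= \frac{Ax_2 \cdot x_2}{x_2 \cdot x_2} = \frac{(Ax_2)^T x_2}{x_2^T x_2} \approx \frac{9.24852}{1.85207} \approx 4.99361 \\
\lambda^{(3)} &= \frac{Ax_3 \cdot x_3}{x_3 \cdot x_3} = \frac{(Ax_3)^T x_3}{x_3^T x_3} \approx \frac{9.84203}{1.96851} \approx 4.99974 \\
\lambda^{(4)} &= \frac{Ax_4 \cdot x_4}{x_4 \cdot x_4} = \frac{(Ax_4)^T x_4}{x_4^T x_4} \approx \frac{9.96808}{1.99362} \approx 4.99999 \\
\lambda^{(5)} &= \frac{Ax_5 \cdot x_5}{x_5 \cdot x_5} = \frac{(Ax_5)^T x_5}{x_5^T x_5} \approx \frac{9.99360}{1.99872} \approx 5.00000
\end{aligned}
\]

Thus, $\lambda^{(5)}$ approximates the dominant eigenvalue correctly to five decimal places and $x_5$ closely
approximates the dominant eigenvector

\[ x = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

that results by taking $t = 1$ in (6).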


Whereas the power method with Euclidean scaling 
produces a sequence that approaches a unit 
dominant eigenvector, maximum entry scaling 
produces a sequence that approaches an eigenvector 
whose largest component is 1. 


Rate of Convergence 


If A is a symmetric matrix whose distinct eigenvalues can be arranged so that

\[ |\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge \cdots \ge |\lambda_k| \]

then the "rate" at which the Rayleigh quotients converge to the dominant eigenvalue $\lambda_1$ depends on the ratio $|\lambda_1| / |\lambda_2|$;
that is, the convergence is slow when this ratio is near 1 and rapid when it is large: the greater the ratio, the more rapid
the convergence. For example, if A is a 2 × 2 symmetric matrix, then the greater the ratio $|\lambda_1| / |\lambda_2|$, the greater the
disparity between the scaling effects of $\lambda_1$ and $\lambda_2$ in Figure 9.2.1, and hence the greater the effect that multiplication by
A has on pulling the iterates toward the eigenspace of $\lambda_1$. Indeed, the rapid convergence in Example 3 is due to the fact
that $|\lambda_1| / |\lambda_2| = 5/1 = 5$, which is considered to be a large ratio. In cases where the ratio is close to 1, the
convergence of the power method may be so slow that other methods must be used.
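
One way to see where this ratio comes from, sketched here only for the 2 × 2 setting used around Figure 9.2.1 (A symmetric, $\lambda_1 > \lambda_2 > 0$, and $x_0 = v_0 + w_0$ with $v_0 \neq 0$ in the eigenspace of $\lambda_1$ and $w_0$ in the eigenspace of $\lambda_2$), is to apply A repeatedly to the two components:

\[
A^k x_0 = \lambda_1^k v_0 + \lambda_2^k w_0
= \lambda_1^k \left( v_0 + \left( \frac{\lambda_2}{\lambda_1} \right)^{k} w_0 \right)
\]

After scaling, the direction of the kth iterate is that of $v_0 + (\lambda_2/\lambda_1)^k w_0$, so the unwanted component shrinks by the factor $\lambda_2/\lambda_1$ at every step. With $\lambda_1 = 5$ and $\lambda_2 = 1$ that factor is 1/5 per iteration, whereas a ratio close to 1 leaves the unwanted component almost unchanged from one step to the next.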


Stopping Procedures


If $\lambda$ is the exact value of the dominant eigenvalue, and if a power method produces the approximation $\lambda^{(k)}$ at the kth
iteration, then we call

\[ \left| \frac{\lambda - \lambda^{(k)}}{\lambda} \right| \tag{12} \]

the relative error in $\lambda^{(k)}$. If this is expressed as a percentage, then it is called the percentage error in $\lambda^{(k)}$. For
example, if $\lambda = 5$ and the approximation after three iterations is $\lambda^{(3)} = 5.1$, then

\[
\text{relative error in } \lambda^{(3)} = \left| \frac{\lambda - \lambda^{(3)}}{\lambda} \right| = \left| \frac{5 - 5.1}{5} \right| = |-0.02| = 0.02
\]

\[
\text{percentage error in } \lambda^{(3)} = 0.02 \times 100\% = 2\%
\]

In applications one usually knows the relative error E that can be tolerated in the dominant eigenvalue, so the goal is to
stop computing iterates once the relative error in the approximation to that eigenvalue is less than E. However, there is a
problem in computing the relative error from (12) in that the eigenvalue $\lambda$ is unknown. To circumvent this problem, it is
usual to estimate $\lambda$ by $\lambda^{(k)}$ and stop the computations when

\[ \left| \frac{\lambda^{(k)} - \lambda^{(k-1)}}{\lambda^{(k)}} \right| < E \tag{13} \]

The quantity on the left side of (13) is called the estimated relative error in $\lambda^{(k)}$ and its percentage form is called the
estimated percentage error in $\lambda^{(k)}$.

EXAMPLE 4 Estimated Relative Error


For the computations in Example 3, find the smallest value of k for which the estimated percentage error in $\lambda^{(k)}$ is less than 0.1%.


Solution The estimated percentage errors in the approximations in Example 3 are as follows:


APPROXIMATION    ESTIMATED RELATIVE ERROR                                                              ESTIMATED PERCENTAGE ERROR

$\lambda^{(2)}$    $|\lambda^{(2)} - \lambda^{(1)}| / |\lambda^{(2)}| = |4.99361 - 4.84615| / 4.99361 \approx 0.02953$    2.953%
$\lambda^{(3)}$    $|\lambda^{(3)} - \lambda^{(2)}| / |\lambda^{(3)}| = |4.99974 - 4.99361| / 4.99974 \approx 0.00123$    0.123%
$\lambda^{(4)}$    $|\lambda^{(4)} - \lambda^{(3)}| / |\lambda^{(4)}| = |4.99999 - 4.99974| / 4.99999 \approx 0.00005$    0.005%
$\lambda^{(5)}$    $|\lambda^{(5)} - \lambda^{(4)}| / |\lambda^{(5)}| = |5.00000 - 4.99999| / 5.00000 \approx 0.00000$    0%


Thus, $\lambda^{(4)} \approx 4.99999$ is the first approximation whose estimated percentage error is less than 0.1%.
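
The stopping rule in Formula 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix `A` and the starting vector below are chosen here as a symmetric 2 x 2 example with eigenvalues 5 and 1 (the matrix of Example 3 is not reproduced in this passage), and the helper names `mat_vec`, `dot`, and `power_method` are hypothetical.

```python
import math

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def power_method(A, x0, tol, max_iter=100):
    """Power method with Euclidean scaling.

    Stops at the first k for which the estimated relative error
    |lam_k - lam_(k-1)| / |lam_k| is below tol. Returns (k, lam_k, x_k).
    """
    x = x0
    lam_prev = None
    for k in range(1, max_iter + 1):
        y = mat_vec(A, x)
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]        # unit vector x_k
        lam = dot(mat_vec(A, x), x)        # Rayleigh quotient (x_k has length 1)
        if lam_prev is not None and abs(lam - lam_prev) / abs(lam) < tol:
            return k, lam, x
        lam_prev = lam
    raise RuntimeError("estimated relative error did not drop below tol")

# Assumed example matrix: symmetric, eigenvalues 5 and 1.
A = [[3.0, 2.0], [2.0, 3.0]]
k, lam, x = power_method(A, [1.0, 0.0], tol=0.001)   # tol = 0.1%
print(k, round(lam, 5))
```

With this assumed matrix the loop stops at the fourth iterate, in line with the table in Example 4.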


Remark A rule for deciding when to stop an iterative process is called a stopping procedure. In the exercises, we will
discuss stopping procedures for the power method that are based on the dominant eigenvector rather than the dominant 
eigenvalue. 


Concept Review 


Power sequence 


Dominant eigenvalue 


Dominant eigenvector 


Power method with Euclidean scaling 


Rayleigh quotient 


Power method with maximum entry scaling 


Relative error 


Percentage error 


Estimated relative error 


Estimated percentage error 


Stopping procedure 
Skills 


* Identify the dominant eigenvalue of a matrix.
* Use the power methods described in this section to approximate a dominant eigenvector. 


* Find the estimated relative and percentage errors associated with the power methods. 


Exercise Set 9.2 


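In Exercises 1-2, the distinct eigenvalues of a matrix A are given. Determine whether A has a dominant eigenvalue, and if so, find it.


1. (a) $\lambda_1 = 7$, $\lambda_2 = 3$, $\lambda_3 = -8$, $\lambda_4 = 1$
   (b) $\lambda_1 = -5$, $\lambda_2 = 3$, $\lambda_3 = 2$, $\lambda_4 = 5$


Answer:


(a) $\lambda_3$ dominant
(b) No dominant eigenvalue


2. (a) $\lambda_1 = 1$, $\lambda_2 = 0$, $\lambda_3 = -3$, $\lambda_4 = 2$
   (b) $\lambda_1 = -3$, $\lambda_2 = -2$, $\lambda_3 = -1$, $\lambda_4 = 3$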

In Exercises 3-4, apply the power method with Euclidean scaling to the matrix A, starting with хо and stopping at X4. 
Compare the resulting approximations to the exact values of the dominant eigenvalue and the corresponding unit 
eigenvector. 


Answer: 


«I 098058]. „ | 098837] _ | 098679] . [ 098715]. 
P 20.19612 %25] 0.15206 l теа |" | —0.15977 l 


dominant eigenvalue: А = 2 + {10+ 5.16228; 


| | 1 1 
dominant eigenvector: Ё - m ii E T 
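4. $A = \begin{bmatrix} 7 & -2 & 0 \\ -2 & 6 & -2 \\ 0 & -2 & 5 \end{bmatrix}$, $\mathbf{x}_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$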


In Exercises 5—6, apply the power method with maximum entry scaling to the matrix A, starting with xg and stopping 
at X4. Compare the resulting approximations to the exact values of the dominant eigenvalue and the corresponding 
scaled eigenvector. 


573 lt 


Answer: 


x} = E oM ЕЯ АО 6/6: ы pu XO a. 6.60550; 


— loi! X w 6.60555; 


dominant eigenvalue: À — 3 4- y 13% 6.60555; 


3 


E | 264/13 | [047186 
ominant eigenvector: 2+ hs 3 RS 0.98167 
26 + AJ 13 


(a) Use the power method with maximum entry scaling to approximate a dominant eigenvector of A. Start with X, 
round off all computations to three decimal places, and stop after three iterations. 


(b) Use the result in part (a) and the Rayleigh quotient to approximate the dominant eigenvalue of A. 
(c) Find the exact values of the eigenvector and eigenvalue approximated in parts (a) and (b). 


(d) Find the percentage error in the approximation of the dominant eigenvalue. 


Answer: 


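(a) $\mathbf{x}_1 = \begin{bmatrix} 1 \\ -0.5 \end{bmatrix}$, $\mathbf{x}_2 = \begin{bmatrix} 1 \\ -0.8 \end{bmatrix}$, $\mathbf{x}_3 \approx \begin{bmatrix} 1 \\ -0.929 \end{bmatrix}$


(b) $\lambda^{(1)} = 2.8$, $\lambda^{(2)} \approx 2.976$, $\lambda^{(3)} \approx 2.997$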
(c) Dominant eigenvalue: $\lambda = 3$; dominant eigenvector: $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$


(d) 0.1%


8. Repeat the directions of Exercise 7 with 


In Exercises 9—10, a matrix А with a dominant eigenvalue and a sequence хо, AXD, ..., A?xg are given. Use Formulas 


9 and 10 to approximate the dominant eigenvalue and a corresponding eigenvector. 


9„„_|12| ЕЕ БИ) а. P". 
‚кре pose и 


2.99993: pe 


1.00000 


11. Consider matrices 


where $\mathbf{x}_0$ is a unit vector and $a \ne 0$. Show that even though the matrix A is symmetric and has a dominant
eigenvalue, the power sequence (1) in Theorem 9.2.1 does not converge. This shows that the requirement in that
theorem that the dominant eigenvalue be positive is essential.


12. Use the power method with Euclidean scaling to approximate the dominant eigenvalue and a corresponding
eigenvector of A. Choose your own starting vector, and stop when the estimated percentage error in the eigenvalue
approximation is less than 0.1%.


(аә [1 3 3 
3- db oed 
3 —1 10 

[1 0 1 1 
rei 
jx. 3] 
1 1 18 


13. Repeat Exercise 12, but this time stop when all corresponding entries in two successive eigenvector approximations 
differ by less than 0.01 in absolute value. 


Answer: 


(a) 1 
Starting with | 0 |, it takes 8 iterations. 

0 
(b) 


Starting with , It takes 8 iterations. 


оо о н 


14. Repeat Exercise 12 using maximum entry scaling. 
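15. Prove: If A is a nonzero n x n matrix, then $A^TA$ and $AA^T$ have positive dominant eigenvalues.

16. (For readers familiar with proof by induction) Let A be an n x n matrix, let $\mathbf{x}_0$ be a unit vector in $R^n$, and define the sequence $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k, \ldots$ by

\[ \mathbf{x}_1 = \frac{A\mathbf{x}_0}{\|A\mathbf{x}_0\|}, \quad \mathbf{x}_2 = \frac{A\mathbf{x}_1}{\|A\mathbf{x}_1\|}, \quad \ldots, \quad \mathbf{x}_k = \frac{A\mathbf{x}_{k-1}}{\|A\mathbf{x}_{k-1}\|}, \quad \ldots \]

Prove by induction that $\mathbf{x}_k = A^k\mathbf{x}_0 / \|A^k\mathbf{x}_0\|$.


17. (For readers familiar with proof by induction) Let A be an n x n matrix, let $\mathbf{x}_0$ be a nonzero vector in $R^n$, and define the sequence $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k, \ldots$ by

\[ \mathbf{x}_1 = \frac{A\mathbf{x}_0}{\max(A\mathbf{x}_0)}, \quad \mathbf{x}_2 = \frac{A\mathbf{x}_1}{\max(A\mathbf{x}_1)}, \quad \ldots, \quad \mathbf{x}_k = \frac{A\mathbf{x}_{k-1}}{\max(A\mathbf{x}_{k-1})}, \quad \ldots \]

Prove by induction that

\[ \mathbf{x}_k = \frac{A^k\mathbf{x}_0}{\max(A^k\mathbf{x}_0)} \]
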
9.3 Internet Search Engines


Early search engines on the Internet worked by examining key words and phrases in pages and titles of posted documents. Today's most popular search engines use algorithms 
based on the power method to analyze hyperlinks (references) between documents. In this section we will discuss one of the ways in which this is done. 


Google, the most widely used engine for searching the Internet, was developed in 1996 by Larry Page and Sergey Brin while both were graduate students at Stanford University. 
Google uses a procedure known as the PageRank algorithm to analyze how documents at relevant sites reference one another. It then assigns to each site a PageRank score, 
stores those scores as a matrix, and uses the components of the dominant eigenvector of that matrix to establish the relative importance of the sites to the search. 


Google starts by using a standard text-based search engine to find an initial set $S_0$ of sites containing relevant pages. Since words can have multiple meanings, the set $S_0$ will
typically contain irrelevant sites and miss others of relevance. To compensate for this, the set $S_0$ is expanded to a larger set S by adjoining all sites referenced by the pages in the
sites of $S_0$. The underlying assumption is that S will contain the most important sites relevant to the search. This process is then repeated a number of times to refine the search
information still further.


To be more specific, suppose that the search set S contains n sites, and define the adjacency matrix for S to be the n x n matrix $A = [a_{ij}]$ in which

\[ a_{ij} = \begin{cases} 1 & \text{if site } i \text{ references site } j \\ 0 & \text{if site } i \text{ does not reference site } j \end{cases} \]


We will assume that no site references itself, so the diagonal entries of А will all be zero. 


EXAMPLE 1 Adjacency Matrices


Here is a typical adjacency matrix for a search set with four sites:

          Referenced Site
           1  2  3  4
    A = [  0  0  1  1  ]  1
        [  1  0  0  0  ]  2   Referencing Site        (1)
        [  1  0  0  1  ]  3
        [  1  1  1  0  ]  4


Thus, Site 1 references Sites 3 and 4, Site 2 references Site 1, and so forth. 


There are two basic roles that a site can play in the search process—the site may be a hub, meaning that it references many other sites, or it may be an authority, meaning that it 
is referenced by many other sites. A given site will typically have both hub and authority properties in that it will both reference and be referenced. 


Historical Note The term google is a variation of the word googol, which stands for the number $10^{100}$ (1 followed by 100 zeros). This term was invented by the
American mathematician Edward Kasner (1878-1955) in 1938, and the story goes that it came about when Kasner asked his eight-year-old nephew to give a name to a
really big number; he responded with "googol." Kasner then went on to define a googolplex to be $10^{\text{googol}}$ (1 followed by googol zeros).


In general, if A is an adjacency matrix for n sites, then the column sums of A measure the authority aspect of the sites and the row sums of A measure their hub aspect. For
example, the column sums of the matrix in (1) are 3, 1, 2, and 2, which means that Site 1 is referenced by three other sites, Site 2 is referenced by one other site, and so forth.
Similarly, the row sums of the matrix in (1) are 2, 1, 2, and 3, so Site 1 references two other sites, Site 2 references one other site, and so forth.

Accordingly, if A is an adjacency matrix, then we call the vector $\mathbf{h}_0$ of row sums of A the initial hub vector of A, and we call the vector $\mathbf{a}_0$ of column sums of A the initial
authority vector of A. Alternatively, we can think of $\mathbf{a}_0$ as the vector of row sums of $A^T$, which turns out to be more convenient for computations. The entries in the hub vector
are called hub weights and those in the authority vector authority weights.


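EXAMPLE 2 Initial Hub and Authority Vectors of an Adjacency Matrix


Find the initial hub and authority vectors for the adjacency matrix A in Example 1.


Solution The row sums of A yield the initial hub vector

\[ \mathbf{h}_0 = \begin{bmatrix} 2 \\ 1 \\ 2 \\ 3 \end{bmatrix} \begin{matrix} \text{Site 1} \\ \text{Site 2} \\ \text{Site 3} \\ \text{Site 4} \end{matrix} \tag{2} \]

and the row sums of $A^T$ (the column sums of A) yield the initial authority vector

\[ \mathbf{a}_0 = \begin{bmatrix} 3 \\ 1 \\ 2 \\ 2 \end{bmatrix} \begin{matrix} \text{Site 1} \\ \text{Site 2} \\ \text{Site 3} \\ \text{Site 4} \end{matrix} \tag{3} \]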
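
As a quick check of Example 2, the row and column sums can be computed directly. The sketch below stores the adjacency matrix of Example 1 as a list of rows; the function names are hypothetical helpers introduced only for this illustration.

```python
# Adjacency matrix from Example 1: A[i][j] == 1 when site i+1 references site j+1.
A = [
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]

def initial_hub_vector(A):
    """Row sums of A: how many sites each site references."""
    return [sum(row) for row in A]

def initial_authority_vector(A):
    """Column sums of A (row sums of A^T): how many sites reference each site."""
    return [sum(col) for col in zip(*A)]

h0 = initial_hub_vector(A)
a0 = initial_authority_vector(A)
print(h0, a0)
```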


The link counting in Example 2 suggests that Site 4 is the major hub and Site 1 is the greatest authority. However, counting links does not tell the whole story; for example, it 
seems reasonable that if Site 1 is to be considered the greatest authority, then more weight should be given to hubs that link to that site, and if Site 4 is to be considered a major 


hub, then more weight should be given to sites to which it links. Thus, there is an interaction between hubs and authorities that needs to be accounted for in the search process. 
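Accordingly, once the search engine has calculated the initial authority vector $\mathbf{a}_0$, it then uses the information in that vector to create new hub and authority vectors $\mathbf{h}_1$ and $\mathbf{a}_1$
using the formulas

\[ \mathbf{h}_1 = \frac{A\mathbf{a}_0}{\|A\mathbf{a}_0\|}, \qquad \mathbf{a}_1 = \frac{A^T\mathbf{h}_1}{\|A^T\mathbf{h}_1\|} \tag{4} \]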


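The numerators in these formulas do the weighting, and the normalization serves to control the size of the entries. To understand how the numerators accomplish the weighting,
view the product $A\mathbf{a}_0$ as a linear combination of the column vectors of A with coefficients from $\mathbf{a}_0$. For example, with the adjacency matrix in Example 1 and the authority vector
calculated in Example 2 we have (rows correspond to Sites 1 through 4, and the columns of A to the referenced sites 1 through 4)

\[ A\mathbf{a}_0 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \\ 2 \\ 2 \end{bmatrix} = 3\begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + 2\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 3 \\ 5 \\ 6 \end{bmatrix} \]

Thus, we see that the links to each referenced site are weighted by the authority values in $\mathbf{a}_0$. To control the size of the entries, the search engine normalizes $A\mathbf{a}_0$ to produce the
updated hub vector (the new hub weights for Sites 1 through 4)

\[ \mathbf{h}_1 = \frac{A\mathbf{a}_0}{\|A\mathbf{a}_0\|} = \frac{1}{\sqrt{86}} \begin{bmatrix} 4 \\ 3 \\ 5 \\ 6 \end{bmatrix} \approx \begin{bmatrix} 0.43133 \\ 0.32350 \\ 0.53916 \\ 0.64700 \end{bmatrix} \]

The new hub vector $\mathbf{h}_1$ can now be used to update the authority vector using Formula 4. The product $A^T\mathbf{h}_1$ performs the weighting, and the normalization controls the size (here the columns of $A^T$ correspond to the referencing sites 1 through 4):

\[ A^T\mathbf{h}_1 = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0.43133 \\ 0.32350 \\ 0.53916 \\ 0.64700 \end{bmatrix} = 0.43133\begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix} + 0.32350\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + 0.53916\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + 0.64700\begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1.50966 \\ 0.64700 \\ 1.07833 \\ 0.97049 \end{bmatrix} \]

\[ \mathbf{a}_1 = \frac{A^T\mathbf{h}_1}{\|A^T\mathbf{h}_1\|} \approx \frac{1}{2.19142} \begin{bmatrix} 1.50966 \\ 0.64700 \\ 1.07833 \\ 0.97049 \end{bmatrix} \approx \begin{bmatrix} 0.68889 \\ 0.29524 \\ 0.49207 \\ 0.44286 \end{bmatrix} \]

The entries of $\mathbf{a}_1$ are the new authority weights for Sites 1 through 4.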


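Once the updated hub and authority vectors, $\mathbf{h}_1$ and $\mathbf{a}_1$, are obtained, the search engine repeats the process and computes a succession of hub and authority vectors, thereby
generating the interrelated sequences

\[ \mathbf{h}_1 = \frac{A\mathbf{a}_0}{\|A\mathbf{a}_0\|}, \quad \mathbf{h}_2 = \frac{A\mathbf{a}_1}{\|A\mathbf{a}_1\|}, \quad \mathbf{h}_3 = \frac{A\mathbf{a}_2}{\|A\mathbf{a}_2\|}, \quad \ldots, \quad \mathbf{h}_k = \frac{A\mathbf{a}_{k-1}}{\|A\mathbf{a}_{k-1}\|}, \quad \ldots \tag{5} \]

\[ \mathbf{a}_1 = \frac{A^T\mathbf{h}_1}{\|A^T\mathbf{h}_1\|}, \quad \mathbf{a}_2 = \frac{A^T\mathbf{h}_2}{\|A^T\mathbf{h}_2\|}, \quad \mathbf{a}_3 = \frac{A^T\mathbf{h}_3}{\|A^T\mathbf{h}_3\|}, \quad \ldots, \quad \mathbf{a}_k = \frac{A^T\mathbf{h}_k}{\|A^T\mathbf{h}_k\|}, \quad \ldots \tag{6} \]

However, each of these is a power sequence in disguise. For example, if we substitute the expression for $\mathbf{h}_k$ into the expression for $\mathbf{a}_k$, then we obtain

\[ \mathbf{a}_k = \frac{A^T\left(\dfrac{A\mathbf{a}_{k-1}}{\|A\mathbf{a}_{k-1}\|}\right)}{\left\|A^T\left(\dfrac{A\mathbf{a}_{k-1}}{\|A\mathbf{a}_{k-1}\|}\right)\right\|} = \frac{(A^TA)\mathbf{a}_{k-1}}{\|(A^TA)\mathbf{a}_{k-1}\|} \]

which means that we can rewrite (6) as

\[ \mathbf{a}_1 = \frac{(A^TA)\mathbf{a}_0}{\|(A^TA)\mathbf{a}_0\|}, \quad \mathbf{a}_2 = \frac{(A^TA)\mathbf{a}_1}{\|(A^TA)\mathbf{a}_1\|}, \quad \ldots, \quad \mathbf{a}_k = \frac{(A^TA)\mathbf{a}_{k-1}}{\|(A^TA)\mathbf{a}_{k-1}\|}, \quad \ldots \tag{7} \]

Similarly, we can rewrite (5) as

\[ \mathbf{h}_1 = \frac{A\mathbf{a}_0}{\|A\mathbf{a}_0\|}, \quad \mathbf{h}_2 = \frac{(AA^T)\mathbf{h}_1}{\|(AA^T)\mathbf{h}_1\|}, \quad \ldots, \quad \mathbf{h}_k = \frac{(AA^T)\mathbf{h}_{k-1}}{\|(AA^T)\mathbf{h}_{k-1}\|}, \quad \ldots \tag{8} \]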

Remark In Exercise 15 of Section 9.2 you were asked to show that $A^TA$ and $AA^T$ both have positive dominant eigenvalues. That being the case, Theorem 9.2.1 ensures that (7)
and (8) converge to the dominant eigenvectors of $A^TA$ and $AA^T$, respectively. The entries in those eigenvectors are the authority and hub weights that Google uses to rank the
search sites in order of importance as hubs and authorities.
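
The update in Formula 4, and its equivalence with one step of the power sequence for $A^TA$, can be checked numerically. The following is a minimal sketch for the four-site matrix of Example 1; the helper names (`mat_vec`, `transpose`, `normalize`, `hub_authority_step`) are assumptions made for this illustration.

```python
import math

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def normalize(v):
    n = math.sqrt(sum(t * t for t in v))
    return [t / n for t in v]

def hub_authority_step(A, a):
    """One application of Formula 4: new hub vector, then new authority vector."""
    h = normalize(mat_vec(A, a))
    a_next = normalize(mat_vec(transpose(A), h))
    return h, a_next

A = [[0, 0, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 1, 1, 0]]
a0 = [3.0, 1.0, 2.0, 2.0]            # initial authority vector (column sums)

h1, a1 = hub_authority_step(A, a0)

# The same authority vector from one step of the power sequence for A^T A.
a1_direct = normalize(mat_vec(transpose(A), mat_vec(A, a0)))

print([round(t, 5) for t in h1])
print([round(t, 5) for t in a1])
```

The two routes agree because the scalar $1/\|A\mathbf{a}_0\|$ cancels when the result is normalized.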


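EXAMPLE 3 A Ranking Procedure


Suppose that a search engine produces 10 Internet sites in its search set and that the adjacency matrix for those sites is

          Referenced Site
           1  2  3  4  5  6  7  8  9  10
    A = [  0  1  0  0  1  0  0  1  0  0  ]  1
        [  0  0  0  0  1  0  0  0  0  0  ]  2
        [  0  0  0  0  1  0  0  0  0  0  ]  3
        [  0  0  0  0  0  1  1  0  0  0  ]  4
        [  0  0  0  0  0  0  0  1  0  0  ]  5   Referencing Site
        [  0  1  1  1  1  0  0  1  0  1  ]  6
        [  0  0  0  0  0  0  0  0  0  1  ]  7
        [  0  0  0  0  1  0  0  0  0  0  ]  8
        [  0  0  0  0  0  1  0  0  0  0  ]  9
        [  0  0  0  0  0  1  0  0  0  0  ]  10


Use Formula 7 to rank the sites in decreasing order of authority.


Solution We will take $\mathbf{a}_0$ to be the normalized vector of column sums of A, and then we will compute the iterates in (7) until the authority vectors seem to
stabilize. We leave it for you to show that

\[ \mathbf{a}_0 = \frac{1}{\sqrt{54}} \begin{bmatrix} 0 \\ 2 \\ 1 \\ 1 \\ 5 \\ 3 \\ 1 \\ 3 \\ 0 \\ 2 \end{bmatrix} \approx \begin{bmatrix} 0 \\ 0.27217 \\ 0.13608 \\ 0.13608 \\ 0.68041 \\ 0.40825 \\ 0.13608 \\ 0.40825 \\ 0 \\ 0.27217 \end{bmatrix} \]

and that

\[ (A^TA)\mathbf{a}_0 \approx \begin{bmatrix} 0&0&0&0&0&0&0&0&0&0 \\ 0&2&1&1&2&0&0&2&0&1 \\ 0&1&1&1&1&0&0&1&0&1 \\ 0&1&1&1&1&0&0&1&0&1 \\ 0&2&1&1&5&0&0&2&0&1 \\ 0&0&0&0&0&3&1&0&0&0 \\ 0&0&0&0&0&1&1&0&0&0 \\ 0&2&1&1&2&0&0&3&0&1 \\ 0&0&0&0&0&0&0&0&0&0 \\ 0&1&1&1&1&0&0&1&0&2 \end{bmatrix} \begin{bmatrix} 0 \\ 0.27217 \\ 0.13608 \\ 0.13608 \\ 0.68041 \\ 0.40825 \\ 0.13608 \\ 0.40825 \\ 0 \\ 0.27217 \end{bmatrix} \approx \begin{bmatrix} 0 \\ 3.26599 \\ 1.90516 \\ 1.90516 \\ 5.30723 \\ 1.36083 \\ 0.54433 \\ 3.67423 \\ 0 \\ 2.17732 \end{bmatrix} \]

Thus,

\[ \mathbf{a}_1 = \frac{(A^TA)\mathbf{a}_0}{\|(A^TA)\mathbf{a}_0\|} \approx \frac{1}{8.15362} \begin{bmatrix} 0 \\ 3.26599 \\ 1.90516 \\ 1.90516 \\ 5.30723 \\ 1.36083 \\ 0.54433 \\ 3.67423 \\ 0 \\ 2.17732 \end{bmatrix} \approx \begin{bmatrix} 0 \\ 0.40056 \\ 0.23366 \\ 0.23366 \\ 0.65090 \\ 0.16690 \\ 0.06676 \\ 0.45063 \\ 0 \\ 0.26704 \end{bmatrix} \]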

Continuing in this way, with $\mathbf{a}_k = (A^TA)\mathbf{a}_{k-1} / \|(A^TA)\mathbf{a}_{k-1}\|$, yields the following authority iterates:

  a0        a1        a2        a3        a4       ...   a9        a10

  0         0         0         0         0              0         0          Site 1
  0.27217   0.40056   0.41652   0.41918   0.41973        0.41990   0.41990    Site 2
  0.13608   0.23366   0.24917   0.25233   0.25308        0.25337   0.25337    Site 3
  0.13608   0.23366   0.24917   0.25233   0.25309        0.25337   0.25337    Site 4
  0.68041   0.65090   0.63407   0.62836   0.62665        0.62597   0.62597    Site 5
  0.40825   0.16690   0.06322   0.02372   0.00889        0.00007   0.00002    Site 6
  0.13608   0.06676   0.02603   0.00981   0.00368        0.00003   0.00001    Site 7
  0.40825   0.45063   0.46672   0.47050   0.47137        0.47165   0.47165    Site 8
  0         0         0         0         0              0         0          Site 9
  0.27217   0.26704   0.27892   0.28300   0.28416        0.28460   0.28460    Site 10


The small changes between $\mathbf{a}_9$ and $\mathbf{a}_{10}$ suggest that the iterates have stabilized near a dominant eigenvector of $A^TA$. From the entries in $\mathbf{a}_{10}$ we conclude that Sites
1, 6, 7, and 9 are probably irrelevant to the search and that the remaining sites should be searched in order of decreasing importance as


Site 5, Site 8, Site 2, Site 10, Sites 3 and 4 (a tie)
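
The ranking procedure of Example 3 can be sketched as a short loop over Formula 7. The adjacency matrix below is a transcription of the one in Example 3 and should be treated as an assumption; the authority iterates depend only on $A^TA$, so the order in which the rows are listed does not affect the result. The helper names and the choice of ten iterations are also assumptions made for this sketch.

```python
import math

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def normalize(v):
    n = math.sqrt(sum(t * t for t in v))
    return [t / n for t in v]

# Assumed transcription of the 10-site adjacency matrix in Example 3.
A = [
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
]
At = transpose(A)

# a0: normalized vector of column sums; then ten steps of Formula 7.
a = normalize([float(sum(col)) for col in zip(*A)])
for _ in range(10):
    a = normalize(mat_vec(At, mat_vec(A, a)))

# Sites in decreasing order of authority weight (site numbers start at 1).
ranking = sorted(range(1, len(a) + 1), key=lambda site: -a[site - 1])
print(ranking)
print([round(t, 5) for t in a])
```

Sites 3 and 4 receive identical weights because they are referenced by exactly the same site, so their relative order in `ranking` is arbitrary.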


Concept Review 

* Adjacency matrix 

* Hub vector 

* Authority vector 

* Hub weights 

* Authority weights 

Skills 

* Find the initial hub and authority vectors of an adjacency matrix. 


* Use the method of Example 3 to rank sites. 


Exercise Set 9.3 


In Exercises 1—2, find the initial hub and authority vectors for the given adjacency matrix A. 


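1.        Referenced Site
           1  2  3
    A = [  0  0  1  ]  1
        [  1  0  1  ]  2   Referencing Site
        [  1  0  1  ]  3

Answer:

$\mathbf{h}_0 = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$, $\mathbf{a}_0 = \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix}$

2.        Referenced Site
           1  2  3  4
    A = [  0  1  0  1  ]  1
        [  1  0  0  1  ]  2   Referencing Site
        [  1  0  0  1  ]  3
        [  1  1  1  0  ]  4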


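In Exercises 3-4, find the updated hub and authority vectors $\mathbf{h}_1$ and $\mathbf{a}_1$ for the adjacency matrix A.


3. The matrix in Exercise 1.


Answer:

$\mathbf{h}_1 \approx \begin{bmatrix} 0.39057 \\ 0.65094 \\ 0.65094 \end{bmatrix}$, $\mathbf{a}_1 \approx \begin{bmatrix} 0.60971 \\ 0 \\ 0.79262 \end{bmatrix}$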


4. The matrix in Exercise 2. 


In Exercises 5—8, the adjacency matrix A of an Internet search engine is given. Use the method of Example 3 to rank the sites in decreasing order of authority. 


5.        Referenced Site
           1  2  3  4
    A = [  0  0  1  0  ]  1
        [  1  0  0  0  ]  2   Referencing Site
        [  1  1  0  0  ]  3
        [  0  1  0  0  ]  4

Answer:


Sites 1 and 2 (tie); sites 3 and 4 are irrelevant


6.        Referenced Site
           1  2  3  4
    A = [  0  1  1  0  ]  1
        [  0  0  1  0  ]  2   Referencing Site
        [  1  0  0  1  ]  3
        [  1  0  0  0  ]  4


7. Referenced Site 


12345 
011 10|1 
1 90 05 132 р . 
A= 00 0-011 3 Referencing Site 
0100 0) 4 
Dol 19-8].5 
Answer: 


Site 2, site 3, site 4; sites 1 and 5 are irrelevant 
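8.        Referenced Site
           1  2  3  4  5  6  7  8  9  10
    A = [  0  1  1  0  1  1  0  0  0  1  ]  1
        [  0  0  1  0  0  0  0  0  0  0  ]  2
        [  0  0  0  0  0  0  0  0  0  1  ]  3
        [  0  1  1  0  0  1  1  0  0  1  ]  4
        [  0  0  0  1  0  0  0  0  0  0  ]  5   Referencing Site
        [  0  1  0  0  0  0  0  0  0  0  ]  6
        [  0  0  0  0  0  0  0  0  1  0  ]  7
        [  0  0  0  0  0  1  0  0  0  0  ]  8
        [  0  1  1  0  0  1  0  1  0  1  ]  9
        [  0  0  0  0  0  1  0  0  0  0  ]  10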
9.4 Comparison of Procedures for Solving Linear Systems


There is an old saying that “time is money.” This is especially true in industry where the cost of solving a linear 
system is generally determined by the time it takes for a computer to perform the required computations. This 
typically depends both on the speed of the computer processor and on the number of operations required by the 
algorithm. Thus, choosing the right algorithm has important financial implications in an industrial or research setting.
In this section we will discuss some of the factors that affect the choice of algorithms for solving large-scale linear 
systems. 


Flops and the Cost of Solving a Linear System 


In computer jargon, an arithmetic operation (+, −, ×, ÷) on two real numbers is called a flop, which is an acronym
for "floating-point operation." The total number of flops required to solve a problem, which is called the cost of the
solution, provides a convenient way of choosing between various algorithms for solving the problem. When needed,
the cost in flops can be converted to units of time or money if the speed of the computer processor and the financial
aspects of its operation are known. For example, many of today's personal computers are capable of performing in
excess of 10 gigaflops per second (1 gigaflop = $10^9$ flops). Thus, an algorithm that costs 1,000,000 flops would be
executed in 0.0001 seconds.


To illustrate how costs (in flops) can be computed, let us count the number of flops required to solve a linear system
of n equations in n unknowns by Gauss-Jordan elimination. For this purpose we will need the following formulas for
the sum of the first n positive integers and the sum of the squares of the first n positive integers:

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \tag{1} \]

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6} \tag{2} \]


Let $A\mathbf{x} = \mathbf{b}$ be a linear system of n equations in n unknowns to be solved by Gauss-Jordan elimination (or,
equivalently, by Gaussian elimination with back substitution). For simplicity, let us assume that A is invertible and
that no row interchanges are required to reduce the augmented matrix $[A \mid \mathbf{b}]$ to row echelon form. The diagrams that
accompany the following analysis provide a convenient way of counting the operations required to introduce a
leading 1 in the first row and then zeros below it. In our operation counts, we will lump divisions and multiplications
together as "multiplications," and we will lump additions and subtractions together as "additions."


Step 1. It requires n flops (multiplications) to introduce the leading 1 in the first row.

\[ \left[\begin{array}{ccccc|c} 1 & \times & \cdots & \times & \times & \times \\ \bullet & \bullet & \cdots & \bullet & \bullet & \bullet \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ \bullet & \bullet & \cdots & \bullet & \bullet & \bullet \end{array}\right] \]

× denotes a quantity that is being computed. • denotes a quantity that is not being computed. The augmented matrix size is n × (n + 1).


Step 2. It requires n multiplications and n additions to introduce a zero below the leading 1, and there are n − 1 rows
below the leading 1, so the number of flops required to introduce zeros below the leading 1 is 2n(n − 1).

\[ \left[\begin{array}{ccccc|c} 1 & \bullet & \cdots & \bullet & \bullet & \bullet \\ 0 & \times & \cdots & \times & \times & \times \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ 0 & \times & \cdots & \times & \times & \times \end{array}\right] \]


Column 1. Combining Steps 1 and 2, the number of flops required for column 1 is

\[ n + 2n(n-1) = 2n^2 - n \]


Column 2. The procedure for column 2 is the same as for column 1, except that now we are
dealing with one less row and one less column. Thus, the number of flops
required to introduce the leading 1 in row 2 and the zeros below it can be obtained
by replacing n by n − 1 in the flop count for the first column. Thus, the number of
flops required for column 2 is

\[ 2(n-1)^2 - (n-1) \]


Column 3. By the argument for column 2, the number of flops required for column 3 is

\[ 2(n-2)^2 - (n-2) \]


Total for all columns. The pattern should now be clear. The total number of flops required to create the n
leading 1's and the associated zeros is

\[ (2n^2 - n) + [2(n-1)^2 - (n-1)] + [2(n-2)^2 - (n-2)] + \cdots + (2 - 1) \]

which we can rewrite as

\[ 2[n^2 + (n-1)^2 + \cdots + 1] - [n + (n-1) + \cdots + 1] \]

or on applying Formulas 1 and 2 as

\[ 2\,\frac{n(n+1)(2n+1)}{6} - \frac{n(n+1)}{2} = \frac{2}{3}n^3 + \frac{1}{2}n^2 - \frac{1}{6}n \]


Next, let us count the number of operations required to complete the backward
phase (the back substitution).


Column n. It requires n − 1 multiplications and n − 1 additions to introduce zeros above the
leading 1 in the nth column, so the total number of flops required for the column
is 2(n − 1).

\[ \left[\begin{array}{ccccc|c} 1 & \bullet & \cdots & \bullet & 0 & \times \\ 0 & 1 & \cdots & \bullet & 0 & \times \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 & \times \\ 0 & 0 & \cdots & 0 & 1 & \bullet \end{array}\right] \]


Column (n − 1). The procedure is the same as for Step 1, except that now we are dealing with one
less row. Thus, the number of flops required for the (n − 1)st column is 2(n − 2).

\[ \left[\begin{array}{ccccc|c} 1 & \bullet & \cdots & 0 & 0 & \times \\ 0 & 1 & \cdots & 0 & 0 & \times \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 & \bullet \\ 0 & 0 & \cdots & 0 & 1 & \bullet \end{array}\right] \]


Column (n − 2). By the argument for column (n − 1), the number of flops required for column
(n − 2) is 2(n − 3).

Total. The pattern should now be clear. The total number of flops to complete the
backward phase is

\[ 2(n-1) + 2(n-2) + 2(n-3) + \cdots + 2(n-n) = 2[n^2 - (1 + 2 + \cdots + n)] \]

which we can rewrite using Formula 1 as

\[ 2\left[n^2 - \frac{n(n+1)}{2}\right] = n^2 - n \]

In summary, we have shown that for Gauss-Jordan elimination the number of flops required for the forward and
backward phases is

\[ \text{flops for forward phase} = \frac{2}{3}n^3 + \frac{1}{2}n^2 - \frac{1}{6}n \tag{3} \]

\[ \text{flops for backward phase} = n^2 - n \tag{4} \]

Thus, the total cost of solving a linear system by Gauss-Jordan elimination is

\[ \text{flops for both phases} = \frac{2}{3}n^3 + \frac{3}{2}n^2 - \frac{7}{6}n \tag{5} \]
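
Formulas 3 through 5 can be checked against the column-by-column counts from the derivation. The sketch below adds up the per-column counts ($2m^2 - m$ for the forward phase, $2(m-1)$ for the backward phase) and compares them with the closed forms using exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def forward_flops(n):
    """Sum of the per-column counts 2m^2 - m for m = n, n-1, ..., 1."""
    return sum(2 * m * m - m for m in range(1, n + 1))

def backward_flops(n):
    """Sum of the per-column counts 2(m - 1) for m = n, n-1, ..., 1."""
    return sum(2 * (m - 1) for m in range(1, n + 1))

def forward_formula(n):      # Formula 3
    return Fraction(2, 3) * n**3 + Fraction(1, 2) * n**2 - Fraction(1, 6) * n

def backward_formula(n):     # Formula 4
    return n * n - n

def total_formula(n):        # Formula 5
    return Fraction(2, 3) * n**3 + Fraction(3, 2) * n**2 - Fraction(7, 6) * n

for n in (1, 2, 3, 10, 100):
    assert forward_flops(n) == forward_formula(n)
    assert backward_flops(n) == backward_formula(n)
    assert forward_flops(n) + backward_flops(n) == total_formula(n)

print(forward_flops(3), backward_flops(3), total_formula(3))
```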


Cost Estimates for Solving Large Linear Systems 


It is a property of polynomials that for large values of the independent variable the term of highest power makes the
major contribution to the value of the polynomial. Thus, for large linear systems we can use (3) and (4) to approximate
the number of flops in the forward and backward phases as

\[ \text{flops for forward phase} \approx \frac{2}{3}n^3 \tag{6} \]

\[ \text{flops for backward phase} \approx n^2 \tag{7} \]

This shows that it is more costly to execute the forward phase than the backward phase for large linear systems.
Indeed, the cost difference between the forward and backward phases can be enormous, as the next example shows.


EXAMPLE 1 Cost of Solving a Large Linear System


Approximate the time required to execute the forward and backward phases of Gauss-Jordan
elimination for a system of 10,000 (= $10^4$) equations in 10,000 unknowns using a computer that can
execute 10 gigaflops per second.


Solution We have $n = 10^4$ for the given system, so from (6) and (7) the number of gigaflops required
for the forward and backward phases is

\[ \text{gigaflops for forward phase} \approx \frac{2}{3}n^3 \times 10^{-9} = \frac{2}{3}(10^4)^3 \times 10^{-9} = \frac{2}{3} \times 10^3 \]

\[ \text{gigaflops for backward phase} \approx n^2 \times 10^{-9} = (10^4)^2 \times 10^{-9} = 10^{-1} \]

Thus, at 10 gigaflops/s the execution times for the forward and backward phases are

\[ \text{time for forward phase} \approx \left(\frac{2}{3} \times 10^3\right) \times 10^{-1}\ \text{s} \approx 66.67\ \text{s} \]

\[ \text{time for backward phase} \approx (10^{-1}) \times 10^{-1}\ \text{s} = 0.01\ \text{s} \]
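
The conversion from flops to seconds used in this example is a single division. A small sketch, with a hypothetical helper `seconds` and the approximations (6) and (7):

```python
def seconds(flops, gigaflops_per_second):
    """Time needed to execute `flops` operations at the given speed."""
    return flops / (gigaflops_per_second * 1e9)

n = 10**4
forward_time = seconds((2 / 3) * n**3, 10)   # Formula 6 at 10 gigaflops/s
backward_time = seconds(n**2, 10)            # Formula 7 at 10 gigaflops/s
print(round(forward_time, 2), backward_time)
```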


We leave it as an exercise for you to confirm the results in Table 1.

Table 1

Approximate Cost for an n × n Matrix A with Large n

Algorithm                                                          Cost in Flops

Gauss-Jordan elimination (forward phase)                           ≈ (2/3)n^3
Gauss-Jordan elimination (backward phase)                          ≈ n^2
LU-decomposition of A                                              ≈ (2/3)n^3
Forward substitution to solve Ly = b                               ≈ n^2
Backward substitution to solve Ux = y                              ≈ n^2
A^{-1} by reducing [A | I] to [I | A^{-1}]                         ≈ 2n^3
Compute A^{-1}b                                                    ≈ 2n^3


Considerations in Choosing an Algorithm for Solving a Linear System 


For a single linear system $A\mathbf{x} = \mathbf{b}$ of n equations in n unknowns, the methods of LU-decomposition and Gauss-Jordan
elimination differ in bookkeeping but otherwise involve the same number of flops. Thus, neither method has
a cost advantage over the other. However, LU-decomposition has other advantages that make it the method of
choice:


* Gauss-Jordan elimination and Gaussian elimination both use the augmented matrix $[A \mid \mathbf{b}]$, so b must be known.
In contrast, LU-decomposition uses only the matrix A, so once that decomposition is known it can be used with as
many right-hand sides as are required, one at a time.


* The LU-decomposition that is computed to solve $A\mathbf{x} = \mathbf{b}$ can be used to compute $A^{-1}$ if needed, with little
additional work.


* For large linear systems in which computer memory is at a premium, one can dispense with the storage of the 1's
and zeros that appear on or below the main diagonal of U, since those entries are known from the form of U. The
space that this opens up can then be used to store the entries of L, thereby reducing the amount of memory
required to solve the system.


* If A is a large matrix consisting mostly of zeros, and if the nonzero entries are concentrated in a "band" around the
main diagonal, then there are techniques that can be used to reduce the cost of LU-decomposition, giving it an
advantage over Gauss-Jordan elimination.
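
The first advantage in the list above, reusing one factorization for several right-hand sides, can be illustrated with a small sketch. It assumes, as in the flop counts of this section, that no row interchanges are needed. Note one departure from the text: this sketch places the 1's on the diagonal of L (a Doolittle-style factorization) rather than on the diagonal of U, which does not change the idea. The function names and the 2 x 2 system are made up for the illustration.

```python
def lu_decompose(A):
    """Factor A = LU without row interchanges (L has 1's on its diagonal)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

def lu_solve(L, U, b):
    """Solve Ly = b by forward substitution, then Ux = y by back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

A = [[2.0, 1.0], [4.0, 5.0]]
L, U = lu_decompose(A)             # the costly step, done once

x1 = lu_solve(L, U, [3.0, 9.0])    # each new right-hand side costs only
x2 = lu_solve(L, U, [1.0, 5.0])    # two triangular solves
print(x1, x2)
```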


The cost in flops for Gaussian elimination is the 
same as that for the forward phase of Gauss- 
Jordan elimination. 


Concept Review 

* Flop 

* Formula for the sum of the first n positive integers 

* Formula for the sum of the squares of the first n positive integers 
* Cost in flops for solving large linear systems by various methods 
* Cost in flops for inverting a matrix by row reduction 


* [ssues to consider when choosing an algorithm to solve a large linear system 


Skills 

* Compute the cost of solving a linear system by Gauss-Jordan elimination. 

* Approximate the time required to execute the forward and backward phases of Gauss-Jordan elimination. 
* Approximate the time required to find an LU-decomposition of a matrix. 


* Approximate the time required to find the inverse of an invertible matrix. 


Exercise Set 9.4 


1. A certain computer can execute 10 gigaflops per second. Use Formula 5 to find the time required to solve the 
system using Gauss-Jordan elimination. 


(а) А system of 1000 equations in 1000 unknowns. 
(b) A system of 10,000 equations in 10,000 unknowns. 
(c) А system of 100,000 equations in 100,000 unknowns. 


Answer: 


(a) ≈ 0.067 second
(b) ≈ 66.68 seconds
(c) ≈ 66,668 seconds, or about 18.5 hours

2. A certain computer can execute 100 gigaflops per second. Use Formula 5 to find the time required to solve the
system using Gauss-Jordan elimination.
(a) A system of 10,000 equations in 10,000 unknowns.
(b) A system of 100,000 equations in 100,000 unknowns.
(c) A system of 1,000,000 equations in 1,000,000 unknowns.

3. Today's personal computers can execute 70 gigaflops per second. Use Table 1 to estimate the time required to
perform the following operations on the invertible 10,000 × 10,000 matrix A.
(a) Execute the forward phase of Gauss-Jordan elimination.
(b) Execute the backward phase of Gauss-Jordan elimination.
(c) LU-decomposition of A.
(d) Find $A^{-1}$ by reducing $[A \mid I]$ to $[I \mid A^{-1}]$.


Answer:


(a) ≈ 9.52 seconds
(b) ≈ 0.0014 second
(c) ≈ 9.52 seconds
(d) ≈ 28.6 seconds


4. The IBM Roadrunner computer can operate at speeds in excess of 1 petaflop per second (1 petaflop = $10^{15}$
flops). Use Table 1 to estimate the time required to perform the following operations on the invertible
100,000 × 100,000 matrix A.
(a) Execute the forward phase of Gauss-Jordan elimination.
(b) Execute the backward phase of Gauss-Jordan elimination.
(c) LU-decomposition of A.
(d) Find $A^{-1}$ by reducing $[A \mid I]$ to $[I \mid A^{-1}]$.

5. (a) Approximate the time required to execute the forward phase of Gauss-Jordan elimination for a system of
100,000 equations in 100,000 unknowns using a computer that can execute 1 gigaflop per second. Do the
same for the backward phase. (See Table 1.)

(b) How many gigaflops per second must a computer be able to execute to find the LU-decomposition of a
matrix of size 10,000 × 10,000 in less than 0.5 s? (See Table 1.)


Answer:


(a) $6.67 \times 10^5$ s for forward phase, 10 s for backward phase
(b) 1334


6. About how many teraflops per second must a computer be able to execute to find the inverse of a matrix of size
100,000 × 100,000 in less than 0.5 s? (1 teraflop = $10^{12}$ flops.)


In Exercises 7-10, A and B are n × n matrices and c is a real number.

7. How many flops are required to compute cA?
Answer:


$n^2$ flops

8. How many flops are required to compute A + B?


9. How many flops are required to compute AB?
Answer:


$2n^3 - n^2$ flops


10. If A is a diagonal matrix and k is a positive integer, how many flops are required to compute $A^k$?
9.5 Singular Value Decomposition


In this section we will discuss an extension of the diagonalization theory for n × n symmetric matrices to general
m × n matrices. The results that we will develop in this section have applications to compression, storage, and
transmission of digitized information and form the basis for many of the best computational algorithms that are
currently available for solving linear systems.


Decompositions of Square Matrices 
We saw in Formula 2 of Section 7.2 that every symmetric matrix A can be expressed as

\[ A = PDP^T \tag{1} \]

where P is an n × n orthogonal matrix of eigenvectors of A, and D is the diagonal matrix whose diagonal entries are
the eigenvalues corresponding to the column vectors of P. In this section we will call (1) an eigenvalue
decomposition of A (abbreviated EVD of A).


If an n × n matrix A is not symmetric, then it does not have an eigenvalue decomposition, but it does have a
Hessenberg decomposition

\[ A = PHP^T \]

in which P is an orthogonal matrix and H is in upper Hessenberg form (Theorem 7.2.4).


Moreover, if A has real eigenvalues, then it has a Schur decomposition

\[ A = PSP^T \]

in which P is an orthogonal matrix and S is upper triangular (Theorem 7.2.3).


The eigenvalue, Hessenberg, and Schur decompositions are important in numerical algorithms not only because the
matrices D, H, and S have simpler forms than A, but also because the orthogonal matrices that appear in these
factorizations do not magnify roundoff error. To see why this is so, suppose that $\hat{\mathbf{x}}$ is a column vector whose entries
are known exactly and that

\[ \mathbf{x} = \hat{\mathbf{x}} + \mathbf{e} \]

is the vector that results when roundoff error is present in the entries of $\hat{\mathbf{x}}$.


If P is an orthogonal matrix, then the length-preserving property of orthogonal transformations implies that

\[ \|P\mathbf{x} - P\hat{\mathbf{x}}\| = \|\mathbf{x} - \hat{\mathbf{x}}\| = \|\mathbf{e}\| \]

which tells us that the error in approximating $P\hat{\mathbf{x}}$ by $P\mathbf{x}$ has the same magnitude as the error in approximating $\hat{\mathbf{x}}$ by
$\mathbf{x}$.
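
The claim that an orthogonal matrix does not magnify error can be checked numerically. In the sketch below the orthogonal matrix is taken to be a 2 x 2 rotation, and the vector, the error, the angle, and the non-orthogonal comparison matrix `M` are all made-up values chosen only for illustration.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

theta = 0.7                                   # arbitrary rotation angle
P = [[math.cos(theta), -math.sin(theta)],     # rotation matrices are orthogonal
     [math.sin(theta),  math.cos(theta)]]

x_hat = [1.0, 2.0]                            # "exact" vector
e = [1e-3, -2e-3]                             # simulated roundoff error
x = [a + b for a, b in zip(x_hat, e)]

err_after_P = norm([p - q for p, q in zip(mat_vec(P, x), mat_vec(P, x_hat))])

# For contrast: an invertible but non-orthogonal matrix can magnify the error.
M = [[10.0, 0.0], [0.0, 1.0]]
err_after_M = norm([p - q for p, q in zip(mat_vec(M, x), mat_vec(M, x_hat))])

print(norm(e), err_after_P, err_after_M)
```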
There are two main paths that one might follow in looking for other kinds of decompositions of a general square
matrix A: One might look for decompositions of the form

\[ A = PJP^{-1} \]

in which P is invertible but not necessarily orthogonal, or one might look for decompositions of the form

\[ A = U\Sigma V^T \]

in which U and V are orthogonal but not necessarily the same. The first path leads to decompositions in which J is
either diagonal or a certain kind of block diagonal matrix, called a Jordan canonical form in honor of the French
mathematician Camille Jordan (see p. 510). Jordan canonical forms, which we will not consider in this text, are
important theoretically and in certain applications, but they are of lesser importance numerically because of the
roundoff difficulties that result from the lack of orthogonality in P. In this section we will focus on the second path.


Singular Values 


Since matrix products of the form $A^TA$ will play an important role in our work, we will begin with two basic theorems about them.


THEOREM 9.5.1 


If $A$ is an $m \times n$ matrix, then:

(a) $A$ and $A^TA$ have the same null space.

(b) $A$ and $A^TA$ have the same row space.

(c) $A^T$ and $A^TA$ have the same column space.

(d) $A$ and $A^TA$ have the same rank.


We will prove part (a) and leave the remaining proofs for the exercises.

Proof (a) We must show that every solution of $A\mathbf{x} = \mathbf{0}$ is a solution of $A^TA\mathbf{x} = \mathbf{0}$, and conversely. If $\mathbf{x}_0$ is any solution of $A\mathbf{x} = \mathbf{0}$, then $\mathbf{x}_0$ is also a solution of $A^TA\mathbf{x} = \mathbf{0}$ since

$$A^TA\mathbf{x}_0 = A^T(A\mathbf{x}_0) = A^T\mathbf{0} = \mathbf{0}$$

Conversely, if $\mathbf{x}_0$ is any solution of $A^TA\mathbf{x} = \mathbf{0}$, then $\mathbf{x}_0$ is in the null space of $A^TA$ and hence is orthogonal to all vectors in the row space of $A^TA$ by part (q) of Theorem 4.8.10.

However, $A^TA$ is symmetric, so $\mathbf{x}_0$ is also orthogonal to every vector in the column space of $A^TA$. In particular, $\mathbf{x}_0$ must be orthogonal to the vector $(A^TA)\mathbf{x}_0$; that is,

$$\mathbf{x}_0 \cdot (A^TA)\mathbf{x}_0 = 0$$

Using the first formula in Table 1 of Section 3.2 and properties of the transpose operation we can rewrite this as

$$\mathbf{x}_0^T(A^TA)\mathbf{x}_0 = (A\mathbf{x}_0)^T(A\mathbf{x}_0) = (A\mathbf{x}_0)\cdot(A\mathbf{x}_0) = \|A\mathbf{x}_0\|^2 = 0$$

which implies that $A\mathbf{x}_0 = \mathbf{0}$, thereby proving that $\mathbf{x}_0$ is a solution of $A\mathbf{x} = \mathbf{0}$.


THEOREM 9.5.2 


If $A$ is an $m \times n$ matrix, then:

(a) $A^TA$ is orthogonally diagonalizable.

(b) The eigenvalues of $A^TA$ are nonnegative.

Proof (a) The matrix $A^TA$, being symmetric, is orthogonally diagonalizable by Theorem 7.2.1.

Proof (b) Since $A^TA$ is orthogonally diagonalizable, there is an orthonormal basis for $R^n$ consisting of eigenvectors of $A^TA$, say $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. If we let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the corresponding eigenvalues, then for $1 \le i \le n$ we have

$$\|A\mathbf{v}_i\|^2 = A\mathbf{v}_i \cdot A\mathbf{v}_i = \mathbf{v}_i \cdot A^TA\mathbf{v}_i \quad \text{[Formula (26) of Section 3.2]}$$

$$= \mathbf{v}_i \cdot \lambda_i\mathbf{v}_i = \lambda_i(\mathbf{v}_i \cdot \mathbf{v}_i) = \lambda_i\|\mathbf{v}_i\|^2 = \lambda_i$$

It follows from this relationship that $\lambda_i \ge 0$.


DEFINITION 1 


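If $A$ is an $m \times n$ matrix, and if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A^TA$, then the numbers

$$\sigma_1 = \sqrt{\lambda_1}, \quad \sigma_2 = \sqrt{\lambda_2}, \quad \ldots, \quad \sigma_n = \sqrt{\lambda_n}$$

are called the singular values of $A$.

We will assume throughout this section that the eigenvalues of $A^TA$ are named so that

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$$

and hence that

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$$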


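EXAMPLE 1 Singular Values

Find the singular values of the matrix

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Solution The first step is to find the eigenvalues of the matrix

$$A^TA = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

The characteristic polynomial of $A^TA$ is

$$\lambda^2 - 4\lambda + 3 = (\lambda - 3)(\lambda - 1)$$

so the eigenvalues of $A^TA$ are $\lambda_1 = 3$ and $\lambda_2 = 1$, and the singular values of $A$ in order of decreasing size are

$$\sigma_1 = \sqrt{\lambda_1} = \sqrt{3}, \quad \sigma_2 = \sqrt{\lambda_2} = 1$$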
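The computation in Example 1 can be mirrored in a few lines of code. This is only a sketch for matrices with two columns, where the eigenvalues of the 2 x 2 matrix $A^TA$ come from the quadratic formula; the function names are illustrative and are not part of the text.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def singular_values_two_columns(A):
    """Singular values of an m x 2 matrix from the eigenvalues of A^T A."""
    G = matmul(transpose(A), A)                  # the 2 x 2 symmetric matrix A^T A
    tr = G[0][0] + G[1][1]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    # Roots of lambda^2 - tr*lambda + det; max() guards against tiny negative roundoff.
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2
    return math.sqrt(max(lam1, 0.0)), math.sqrt(max(lam2, 0.0))

A = [[1, 1], [0, 1], [1, 0]]
sigma1, sigma2 = singular_values_two_columns(A)
print(sigma1, sigma2)                            # approximately sqrt(3) and 1
```

For matrices with more columns the characteristic polynomial has higher degree, so a numerical eigenvalue routine would be needed in place of the quadratic formula.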


Singular Value Decomposition 


Before turning to the main result in this section, we will find it useful to extend the notion of a "main diagonal" to matrices that are not square. We define the main diagonal of an $m \times n$ matrix to be the line of entries shown in Figure 9.5.1; it starts at the upper left corner and extends diagonally as far as it can go. We will refer to the entries on the main diagonal as the diagonal entries.


Main diagonal 


Figure 9.5.1 


We are now ready to consider the main result in this section, which is concerned with a specific way of factoring a general $m \times n$ matrix $A$. This factorization, called singular value decomposition (abbreviated SVD), will be given in two forms: a brief form that captures the main idea, and an expanded form that spells out the details. The proof is given at the end of this section.


THEOREM 9.5.3 Singular Value Decomposition 


If $A$ is an $m \times n$ matrix, then $A$ can be expressed in the form

$$A = U\Sigma V^T$$

where $U$ and $V$ are orthogonal matrices and $\Sigma$ is an $m \times n$ matrix whose diagonal entries are the singular values of $A$ and whose other entries are zero.


Harry Bateman (1882—1946) 


Historical Note The term singular value is apparently due to the British-born mathematician Harry 
Bateman, who used it in a research paper published in 1908. Bateman emigrated to the United States in 
1910, teaching at Bryn Mawr College, Johns Hopkins University, and finally at the California Institute of 
Technology. Interestingly, he was awarded his Ph.D. in 1913 by Johns Hopkins at which point in time he 
was already an eminent mathematician with 60 publications to his name. 

[Image: Courtesy of the Archives, California Institute of Technology]


THEOREM 9.5.4 Singular Value Decomposition (Expanded Form) 


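If $A$ is an $m \times n$ matrix of rank $k$, then $A$ can be factored as

$$A = U\Sigma V^T = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_k \mid \mathbf{u}_{k+1} & \cdots & \mathbf{u}_m \end{bmatrix} \begin{bmatrix} \begin{matrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k \end{matrix} & 0_{k \times (n-k)} \\ 0_{(m-k) \times k} & 0_{(m-k) \times (n-k)} \end{bmatrix} \begin{bmatrix} \mathbf{v}_1^T \\ \mathbf{v}_2^T \\ \vdots \\ \mathbf{v}_k^T \\ \mathbf{v}_{k+1}^T \\ \vdots \\ \mathbf{v}_n^T \end{bmatrix}$$

in which $U$, $\Sigma$, and $V$ have sizes $m \times m$, $m \times n$, and $n \times n$, respectively, and in which

(a) $V = [\mathbf{v}_1 \ \mathbf{v}_2 \ \cdots \ \mathbf{v}_n]$ orthogonally diagonalizes $A^TA$.

(b) The nonzero diagonal entries of $\Sigma$ are $\sigma_1 = \sqrt{\lambda_1}, \sigma_2 = \sqrt{\lambda_2}, \ldots, \sigma_k = \sqrt{\lambda_k}$, where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are the nonzero eigenvalues of $A^TA$ corresponding to the column vectors of $V$.

(c) The column vectors of $V$ are ordered so that $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k > 0$.

(d) $\mathbf{u}_i = \dfrac{A\mathbf{v}_i}{\|A\mathbf{v}_i\|} = \dfrac{1}{\sigma_i}A\mathbf{v}_i \quad (i = 1, 2, \ldots, k)$

(e) $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ is an orthonormal basis for col($A$).

(f) $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k, \mathbf{u}_{k+1}, \ldots, \mathbf{u}_m\}$ is an extension of $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ to an orthonormal basis for $R^m$.

The vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k$ are called the left singular vectors of $A$, and the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are called the right singular vectors of $A$.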


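EXAMPLE 2 Singular Value Decomposition if A Is Not Square

Find a singular value decomposition of the matrix

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Solution We showed in Example 1 that the eigenvalues of $A^TA$ are $\lambda_1 = 3$ and $\lambda_2 = 1$ and that the corresponding singular values of $A$ are $\sigma_1 = \sqrt{3}$ and $\sigma_2 = 1$. We leave it for you to verify that

$$\mathbf{v}_1 = \begin{bmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} \quad \text{and} \quad \mathbf{v}_2 = \begin{bmatrix} \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} \end{bmatrix}$$

are eigenvectors corresponding to $\lambda_1$ and $\lambda_2$, respectively, and that $V = [\mathbf{v}_1 \mid \mathbf{v}_2]$ orthogonally diagonalizes $A^TA$. From part (d) of Theorem 9.5.4, the vectors

$$\mathbf{u}_1 = \frac{1}{\sigma_1}A\mathbf{v}_1 = \frac{\sqrt{3}}{3}\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{6}}{3} \\ \frac{\sqrt{6}}{6} \\ \frac{\sqrt{6}}{6} \end{bmatrix}, \quad \mathbf{u}_2 = \frac{1}{\sigma_2}A\mathbf{v}_2 = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}$$

are two of the three column vectors of $U$. Note that $\mathbf{u}_1$ and $\mathbf{u}_2$ are orthonormal, as expected. We could extend the set $\{\mathbf{u}_1, \mathbf{u}_2\}$ to an orthonormal basis for $R^3$. However, the computations will be easier if we first remove the messy radicals by multiplying $\mathbf{u}_1$ and $\mathbf{u}_2$ by appropriate scalars. Thus, we will look for a unit vector $\mathbf{u}_3$ that is orthogonal to

$$\sqrt{6}\,\mathbf{u}_1 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \sqrt{2}\,\mathbf{u}_2 = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$$

To satisfy these two orthogonality conditions, the vector $\mathbf{u}_3$ must be a solution of the homogeneous linear system

$$\begin{bmatrix} 2 & 1 & 1 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

We leave it for you to show that a general solution of this system is

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = t\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$$

Normalizing the vector on the right yields

$$\mathbf{u}_3 = \begin{bmatrix} -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \end{bmatrix}$$

Thus, the singular value decomposition of $A$ is

$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{6}}{3} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{\sqrt{6}}{6} & -\frac{\sqrt{2}}{2} & \frac{1}{\sqrt{3}} \\ \frac{\sqrt{6}}{6} & \frac{\sqrt{2}}{2} & \frac{1}{\sqrt{3}} \end{bmatrix}\begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$$

$$A = U\Sigma V^T$$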
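A factorization of this kind is easy to check by multiplying it out. The sketch below assumes the matrices $U$, $\Sigma$, and $V^T$ obtained from Theorem 9.5.4 for the matrix of Example 2, and uses small hand-written helper functions rather than any particular linear algebra package.

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# Columns of U are u1, u2, u3; Sigma is 3 x 2; Vt is the transpose of V = [v1 | v2].
U = [[r6 / 3, 0.0, -1 / r3],
     [r6 / 6, -r2 / 2, 1 / r3],
     [r6 / 6, r2 / 2, 1 / r3]]
Sigma = [[r3, 0.0], [0.0, 1.0], [0.0, 0.0]]
Vt = [[r2 / 2, r2 / 2], [r2 / 2, -r2 / 2]]

product = matmul(matmul(U, Sigma), Vt)   # should reproduce A up to roundoff
UtU = matmul(transpose(U), U)            # should be the 3 x 3 identity up to roundoff
for row in product:
    print([round(t, 12) for t in row])
```

Checking that $U^TU$ is the identity confirms that the extension by $\mathbf{u}_3$ really produced an orthogonal matrix.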


Eugenio Beltrami (1835—1900) 


Gene H. Golub (1932–)


Historical Note The theory of singular value decompositions can be traced back to the work of five people: the Italian mathematician Eugenio Beltrami, the French mathematician Camille Jordan, the English mathematician James Sylvester (see p. 34), and the German mathematicians Erhard Schmidt (see p. 360) and Hermann Weyl. More recently, the pioneering efforts of the American mathematician Gene Golub produced a stable and efficient algorithm for computing it. Beltrami and Jordan were the progenitors of the decomposition: Beltrami gave a proof of the result for real, invertible matrices with distinct singular values in 1873. Subsequently, Jordan refined the theory and eliminated the unnecessary restrictions imposed by Beltrami. Sylvester, apparently unfamiliar with the work of Beltrami and Jordan, rediscovered the result in 1889 and suggested its importance. Schmidt was the first person to show that the singular value decomposition could be used to approximate a matrix by another matrix with lower rank, and, in so doing, he transformed it from a mathematical curiosity to an important practical tool. Weyl showed how to find the lower rank approximations in the presence of error.

[Images: wikipedia (Beltrami); The Granger Collection, New York (Jordan); Courtesy Electronic Publishing Services, Inc., New York City (Weyl); wikipedia (Golub)]


OPTIONAL 


We conclude this section with an optional proof of Theorem 9.5.4. 


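Proof of Theorem 9.5.4 For notational simplicity we will prove this theorem in the case where $A$ is an $n \times n$ matrix. To modify the argument for an $m \times n$ matrix you need only make the notational adjustments required to account for the possibility that $m > n$ or $n > m$.

The matrix $A^TA$ is symmetric, so it has an eigenvalue decomposition

$$A^TA = VDV^T$$

in which the column vectors of

$$V = [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \cdots \mid \mathbf{v}_n]$$

are unit eigenvectors of $A^TA$, and $D$ is a diagonal matrix whose successive diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of $A^TA$ corresponding in succession to the column vectors of $V$. Since $A$ is assumed to have rank $k$, it follows from Theorem 9.5.1 that $A^TA$ also has rank $k$. It follows as well that $D$ has rank $k$, since it is similar to $A^TA$ and rank is a similarity invariant. Thus, $D$ can be expressed in the form

$$D = \begin{bmatrix} \lambda_1 & & & & & \\ & \lambda_2 & & & & \\ & & \ddots & & & \\ & & & \lambda_k & & \\ & & & & 0 & \\ & & & & & \ddots \\ & & & & & & 0 \end{bmatrix} \tag{2}$$

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k > 0$. Now let us consider the set of image vectors

$$\{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_n\} \tag{3}$$

This is an orthogonal set, for if $i \ne j$, then the orthogonality of $\mathbf{v}_i$ and $\mathbf{v}_j$ implies that

$$A\mathbf{v}_i \cdot A\mathbf{v}_j = \mathbf{v}_i \cdot A^TA\mathbf{v}_j = \mathbf{v}_i \cdot \lambda_j\mathbf{v}_j = \lambda_j(\mathbf{v}_i \cdot \mathbf{v}_j) = 0$$

Moreover, the first $k$ vectors in (3) are nonzero since we showed in the proof of Theorem 9.5.2b that $\|A\mathbf{v}_i\|^2 = \lambda_i$ for $i = 1, 2, \ldots, n$, and we have assumed that the first $k$ diagonal entries in (2) are positive. Thus,

$$S = \{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_k\}$$

is an orthogonal set of nonzero vectors in the column space of $A$. But the column space of $A$ has dimension $k$ since

$$\text{rank}(A) = \text{rank}(A^TA) = k$$

and hence $S$, being a linearly independent set of $k$ vectors, must be an orthogonal basis for col($A$). If we now normalize the vectors in $S$, we will obtain an orthonormal basis $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k\}$ for col($A$) in which

$$\mathbf{u}_i = \frac{A\mathbf{v}_i}{\|A\mathbf{v}_i\|} = \frac{1}{\sqrt{\lambda_i}}A\mathbf{v}_i \quad (1 \le i \le k)$$

or, equivalently, in which

$$A\mathbf{v}_1 = \sqrt{\lambda_1}\,\mathbf{u}_1 = \sigma_1\mathbf{u}_1, \quad A\mathbf{v}_2 = \sqrt{\lambda_2}\,\mathbf{u}_2 = \sigma_2\mathbf{u}_2, \quad \ldots, \quad A\mathbf{v}_k = \sqrt{\lambda_k}\,\mathbf{u}_k = \sigma_k\mathbf{u}_k \tag{4}$$

It follows from Theorem 6.3.6 that we can extend this to an orthonormal basis

$$\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_k, \mathbf{u}_{k+1}, \ldots, \mathbf{u}_n\}$$

for $R^n$. Now let $U$ be the orthogonal matrix

$$U = [\mathbf{u}_1 \ \mathbf{u}_2 \ \cdots \ \mathbf{u}_k \ \mathbf{u}_{k+1} \ \cdots \ \mathbf{u}_n]$$

and let $\Sigma$ be the diagonal matrix

$$\Sigma = \begin{bmatrix} \sigma_1 & & & & & \\ & \sigma_2 & & & & \\ & & \ddots & & & \\ & & & \sigma_k & & \\ & & & & 0 & \\ & & & & & \ddots \\ & & & & & & 0 \end{bmatrix}$$

It follows from (4), and the fact that $A\mathbf{v}_i = \mathbf{0}$ for $i > k$, that

$$U\Sigma = [\sigma_1\mathbf{u}_1 \ \sigma_2\mathbf{u}_2 \ \cdots \ \sigma_k\mathbf{u}_k \ \mathbf{0} \ \cdots \ \mathbf{0}] = [A\mathbf{v}_1 \ A\mathbf{v}_2 \ \cdots \ A\mathbf{v}_k \ A\mathbf{v}_{k+1} \ \cdots \ A\mathbf{v}_n] = AV$$

which we can rewrite using the orthogonality of $V$ as $A = U\Sigma V^T$.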


Concept Review
• Eigenvalue decomposition
• Hessenberg decomposition
• Schur decomposition
• Magnification of roundoff error
• Properties that $A$ and $A^TA$ have in common
• $A^TA$ is orthogonally diagonalizable
• Eigenvalues of $A^TA$ are nonnegative
• Singular values
• Diagonal entries of a matrix that is not square
• Singular value decomposition


Skills
• Find the singular values of an $m \times n$ matrix.
• Find a singular value decomposition of an $m \times n$ matrix.


Exercise Set 9.5 




In Exercises 1–4, find the distinct singular values of $A$.
1. $A = \begin{bmatrix} 1 & 2 & 0 \end{bmatrix}$


Answer: 


In Exercises 5-12, find a singular value decomposition of A. 


5 , [1 -1 
E 


Answer: 


Answer: 


| 
Ww w|— UJ [r5 
= © | 
ЗЕР 
" UJ 
у 


© 


13. Prove: If $A$ is an $m \times n$ matrix, then $A^TA$ and $AA^T$ have the same rank.

14. Prove part (d) of Theorem 9.5.1 by using part (a) of the theorem and the fact that $A$ and $A^TA$ have $n$ columns.

15. (a) Prove part (b) of Theorem 9.5.1 by first showing that row($A^TA$) is a subspace of row($A$).

(b) Prove part (c) of Theorem 9.5.1 by using part (b).


16. Let $T: R^n \to R^m$ be a linear transformation whose standard matrix $A$ has the singular value decomposition $A = U\Sigma V^T$, and let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ and $B' = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ be the column vectors of $V$ and $U$, respectively. Show that $\Sigma = [T]_{B', B}$.


17. Show that the singular values of $A^TA$ are the squares of the singular values of $A$.


18. Show that if $A = U\Sigma V^T$ is a singular value decomposition of $A$, then $U$ orthogonally diagonalizes $AA^T$.
True-False Exercises

In parts (a)–(g) determine whether the statement is true or false, and justify your answer.

(a) If $A$ is an $m \times n$ matrix, then $A^TA$ is an $m \times m$ matrix.

Answer:

False

(b) If $A$ is an $m \times n$ matrix, then $A^TA$ is a symmetric matrix.

Answer:

True

(c) If $A$ is an $m \times n$ matrix, then the eigenvalues of $A^TA$ are positive real numbers.

Answer:

False

(d) If $A$ is an $n \times n$ matrix, then $A$ is orthogonally diagonalizable.

Answer:

False

(e) If $A$ is an $m \times n$ matrix, then $A^TA$ is orthogonally diagonalizable.

Answer:

True

(f) The eigenvalues of $A^TA$ are the singular values of $A$.

Answer:

False

(g) Every $m \times n$ matrix has a singular value decomposition.

Answer:

True
9.6 Data Compression Using Singular Value Decomposition


Efficient transmission and storage of large quantities of digital data has become a major problem in our technological world. In this section 
we will discuss the role that singular value decomposition plays in compressing digital data so that it can be transmitted more rapidly and 
stored in less space. We assume here that you have read Section 9.5 . 


Reduced Singular Value Decomposition 


Algebraically, the zero rows and columns of the matrix $\Sigma$ in Theorem 9.5.4 are superfluous and can be eliminated by multiplying out the expression $U\Sigma V^T$ using block multiplication and the partitioning shown in that formula. The products that involve zero blocks as factors drop out, leaving

$$A = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_k \end{bmatrix}\begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k \end{bmatrix}\begin{bmatrix} \mathbf{v}_1^T \\ \mathbf{v}_2^T \\ \vdots \\ \mathbf{v}_k^T \end{bmatrix} \tag{1}$$

which is called a reduced singular value decomposition of $A$. In this text we will denote the matrices on the right side of (1) by $U_1$, $\Sigma_1$, and $V_1^T$, respectively, and we will write this equation as

$$A = U_1\Sigma_1V_1^T \tag{2}$$

Note that the sizes of $U_1$, $\Sigma_1$, and $V_1^T$ are $m \times k$, $k \times k$, and $k \times n$, respectively, and that the matrix $\Sigma_1$ is invertible, since its diagonal entries are positive.

If we multiply out on the right side of (1) using the column-row rule, then we obtain

$$A = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_k\mathbf{u}_k\mathbf{v}_k^T \tag{3}$$

which is called a reduced singular value expansion of $A$. This result applies to all matrices, whereas the spectral decomposition [Formula 7 of Section 7.2] applies only to symmetric matrices.


Remark It can be proved that an $m \times n$ matrix $M$ has rank 1 if and only if it can be factored as $M = \mathbf{u}\mathbf{v}^T$, where $\mathbf{u}$ is a column vector in $R^m$ and $\mathbf{v}$ is a column vector in $R^n$. Thus, a reduced singular value decomposition expresses a matrix $A$ of rank $k$ as a linear combination of $k$ rank 1 matrices.


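EXAMPLE 1 Reduced Singular Value Decomposition

Find a reduced singular value decomposition and a reduced singular value expansion of the matrix

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$$

Solution In Example 2 of Section 9.5 we found the singular value decomposition

$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{6}}{3} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{\sqrt{6}}{6} & -\frac{\sqrt{2}}{2} & \frac{1}{\sqrt{3}} \\ \frac{\sqrt{6}}{6} & \frac{\sqrt{2}}{2} & \frac{1}{\sqrt{3}} \end{bmatrix}\begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix} \tag{4}$$

$$A = U\Sigma V^T$$

Since $A$ has rank 2 (verify), it follows from (1) with $k = 2$ that the reduced singular value decomposition of $A$ corresponding to (4) is

$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{6}}{3} & 0 \\ \frac{\sqrt{6}}{6} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{6}}{6} & \frac{\sqrt{2}}{2} \end{bmatrix}\begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$$

This yields the reduced singular value expansion

$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T = \sqrt{3}\begin{bmatrix} \frac{\sqrt{6}}{3} \\ \frac{\sqrt{6}}{6} \\ \frac{\sqrt{6}}{6} \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} + (1)\begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}\begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$$

$$= \sqrt{3}\begin{bmatrix} \frac{\sqrt{3}}{3} & \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{6} & \frac{\sqrt{3}}{6} \\ \frac{\sqrt{3}}{6} & \frac{\sqrt{3}}{6} \end{bmatrix} + (1)\begin{bmatrix} 0 & 0 \\ -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{bmatrix}$$

Note that the matrices in the expansion have rank 1, as expected.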
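The column-row rule behind a reduced singular value expansion amounts to adding up scaled outer products. The sketch below assumes the singular values and singular vectors computed for the 3 x 2 matrix of this example; the helper names `outer` and `expansion` are illustrative only.

```python
import math

def outer(u, v):
    """Rank 1 matrix u v^T built from a column vector u and a column vector v."""
    return [[a * b for b in v] for a in u]

def expansion(sigmas, us, vs):
    """Sum of sigma_i * u_i v_i^T over the singular triples that are supplied."""
    m, n = len(us[0]), len(vs[0])
    total = [[0.0] * n for _ in range(m)]
    for s, u, v in zip(sigmas, us, vs):
        term = outer(u, v)
        for i in range(m):
            for j in range(n):
                total[i][j] += s * term[i][j]
    return total

r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
sigmas = [r3, 1.0]
us = [[r6 / 3, r6 / 6, r6 / 6], [0.0, -r2 / 2, r2 / 2]]   # u1, u2
vs = [[r2 / 2, r2 / 2], [r2 / 2, -r2 / 2]]                # v1, v2

A_rebuilt = expansion(sigmas, us, vs)             # both terms: recovers A
A_rank1 = expansion(sigmas[:1], us[:1], vs[:1])   # leading term only
```

Keeping only the leading term gives a rank 1 matrix whose rows are multiples of one another, which is the kind of truncation used for compression later in this section.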


Data Compression and Image Processing 


Singular value decompositions can be used to “compress” visual information for the purpose of reducing its required storage space and 
speeding up its electronic transmission. The first step in compressing a visual image is to represent it as a numerical matrix from which the 
visual image can be recovered when needed. 


For example, a black and white photograph might be scanned as a rectangular array of pixels (points) and then stored as a matrix $A$ by assigning each pixel a numerical value in accordance with its gray level. If 256 different gray levels are used (0 = white to 255 = black), then the entries in the matrix would be integers between 0 and 255. The image can be recovered from the matrix $A$ by printing or displaying the pixels with their assigned gray levels.


Historical Note In 1924 the U.S. Federal Bureau of Investigation (FBI) began collecting fingerprints and handprints and now has more than 30 million such prints in its files. To reduce the storage cost, the FBI began working with the Los Alamos National Laboratory, the National Bureau of Standards, and other groups in 1993 to devise rank-based compression methods for storing prints in digital form. The following figure shows an original fingerprint and a reconstruction from digital data that was compressed at a ratio of 26:1.

[Figure panels: Original, Reconstruction]


If the matrix $A$ has size $m \times n$, then one might store each of its $mn$ entries individually. An alternative procedure is to compute the reduced singular value decomposition

$$A = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_k\mathbf{u}_k\mathbf{v}_k^T \tag{5}$$

in which $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k$, and store the $\sigma$'s, the $\mathbf{u}$'s, and the $\mathbf{v}$'s.

When needed, the matrix $A$ (and hence the image it represents) can be reconstructed from (5). Since each $\mathbf{u}_j$ has $m$ entries and each $\mathbf{v}_j$ has $n$ entries, this method requires storage space for

$$km + kn + k = k(m + n + 1)$$

numbers. Suppose, however, that the singular values $\sigma_{r+1}, \ldots, \sigma_k$ are sufficiently small that dropping the corresponding terms in (5) produces an acceptable approximation

$$A_r = \sigma_1\mathbf{u}_1\mathbf{v}_1^T + \sigma_2\mathbf{u}_2\mathbf{v}_2^T + \cdots + \sigma_r\mathbf{u}_r\mathbf{v}_r^T \tag{6}$$

to $A$ and the image that it represents. We call (6) the rank $r$ approximation of $A$. This matrix requires storage space for only

$$rm + rn + r = r(m + n + 1)$$

numbers, compared to $mn$ numbers required for entry-by-entry storage of $A$. For example, the rank 100 approximation of a $1000 \times 1000$ matrix $A$ requires storage for only

$$100(1000 + 1000 + 1) = 200{,}100$$

numbers, compared to the 1,000,000 numbers required for entry-by-entry storage of $A$, a compression of almost 80%.
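The storage count is simple enough to tabulate for any matrix size and rank. The following sketch uses illustrative function names (`svd_storage` and `savings` are not from the text) and reproduces the 1000 x 1000 figure quoted above.

```python
def svd_storage(m, n, r):
    """Numbers stored for a rank r approximation: r vectors u_i, r vectors v_i, r sigmas."""
    return r * (m + n + 1)

def savings(m, n, r):
    """Fraction of the m*n entry-by-entry storage that is saved."""
    return 1 - svd_storage(m, n, r) / (m * n)

count = svd_storage(1000, 1000, 100)
saved = savings(1000, 1000, 100)
print(count, saved)   # 200100 numbers, a saving of just under 80%
```

Note that the saving turns negative once $r(m + n + 1)$ exceeds $mn$, so the approach only pays off when the retained rank is small relative to the matrix dimensions.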


Figure 9.6.1 shows some approximations of a digitized mandrill image obtained using 6. 


Figure 9.6.1 (panels: Rank 4, Rank 10, Rank 20, Rank 50, Rank 128)


Concept Review
• Reduced singular value decomposition
• Reduced singular value expansion
• Rank of an approximation


Skills
• Find the reduced singular value decomposition of an $m \times n$ matrix.
• Find the reduced singular value expansion of an $m \times n$ matrix.


Exercise Set 9.6 


In Exercises 1–4, find a reduced singular value decomposition of $A$. [Note: Each matrix appears in Exercise Set 9.5, where you were asked to find its (unreduced) singular value decomposition.]


1. $A = \begin{bmatrix} -2 & 2 \\ -1 & 1 \\ 2 & -2 \end{bmatrix}$

Answer: 


ok Ok D- 
oon E + 


К 
Il 
м о = 


In Exercises 5-8, find a reduced singular value expansion of А. 
5. The matrix А in Exercise 1. 


Answer: 


342 


А 
БҮ 
EL 


+ 


WIN bole wii 


6. The matrix A in Exercise 2. 


7. The matrix A in Exercise 3. 


Answer: 


8. The matrix A in Exercise 4. 


9. Suppose $A$ is a $200 \times 500$ matrix. How many numbers must be stored in the rank 100 approximation of $A$? Compare this with the number of entries of $A$.


Answer:
70,100 numbers must be stored; $A$ has 100,000 entries


True-False Exercises 


In parts (a)–(c) determine whether the statement is true or false, and justify your answer. Assume that $U_1\Sigma_1V_1^T$ is a reduced singular value decomposition of an $m \times n$ matrix of rank $k$.

(a) $U_1$ has size $m \times k$.

Answer:

True

(b) $\Sigma_1$ has size $k \times k$.

Answer:

True

(c) $V_1$ has size $k \times n$.

Answer:

False
Chapter 9 Supplementary Exercises


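1. Find an LU-decomposition of $A = \begin{bmatrix} -6 & 2 \\ 6 & 0 \end{bmatrix}$.


Answer:

$$A = \begin{bmatrix} 2 & 0 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} -3 & 1 \\ 0 & 2 \end{bmatrix}$$

2. Find the LDU-decomposition of the matrix $A$ in Exercise 1.


3. Find an LU-decomposition of $A = \begin{bmatrix} 2 & 4 & 6 \\ 1 & 4 & 7 \\ 1 & 3 & 7 \end{bmatrix}$.


Answer:

$$A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 1 & 1 & 2 \end{bmatrix}\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}$$

4. Find the LDU-decomposition of the matrix $A$ in Exercise 3.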


"Let A= [ ‚| and xg = B 


(a) Identify the dominant eigenvalue of $A$ and then find the corresponding dominant unit eigenvector $\mathbf{v}$ with positive entries.


(b) Apply the power method with Euclidean scaling to $A$ and $\mathbf{x}_0$, stopping at $\mathbf{x}_5$. Compare your value of $\mathbf{x}_5$ to the eigenvector $\mathbf{v}$ found in part (a).


(c) Apply the power method with maximum entry scaling to $A$ and $\mathbf{x}_0$, stopping at $\mathbf{x}_5$. Compare your


result with the eigenvector | | | 


Answer: 


(a) 


Ы). [07100] ... [07071 
577103041" ^ | 0.7071 


1 
mon 


. Consider the symmetric matrix 


Discuss the behavior of the power sequence

$$\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_k, \ldots$$

with Euclidean scaling for a general nonzero vector $\mathbf{x}_0$. What is it about the matrix that causes the observed behavior?


7. Suppose that a symmetric matrix $A$ has distinct eigenvalues $\lambda_1 = 8$, $\lambda_2 = 1.4$, $\lambda_3 = 2.3$, and $\lambda_4 = -8.1$. What can you say about the convergence of the Rayleigh quotients?


е Find a singular value decomposition of A = | A. | 


1 1 
9. 1 1 
Find a singular value decomposition of 4 = |0 0 |. 
1 1 
Answer: 
sil joke 


10. Find a reduced singular value decomposition and a reduced singular value expansion of the matrix A in 
Exercise 9. 


11. Find the reduced singular value decomposition of the matrix whose singular value decomposition is 


AL dod ou 
2; 3 - ud ды od» o2 
1. dq zi rper 0] eco "d 
4212 2 2 27430: 22: ЖИЕ ЧОЕ! 
i- p. ЕЕ Е t6 3 
2 2 2 2| o 00| 1 2 2 
lo dioi 3 3 3 
оаа 
Answer: 
l. à 
2 2 
12 0 6 П ГО Е 
4 -810| |2 224 013 3 3 
bo apa e quis 2 1 
12 0 6 2 2 5 3 3 
E! 
2 2 


12. Do orthogonally similar matrices have the same singular values? Justify your answer. 


13. If $P$ is the standard matrix for the orthogonal projection of $R^n$ onto a subspace $W$, what can you say about the singular values of $P$?
INTRODUCTION


This chapter consists of 20 applications of linear algebra. With one clearly marked exception, each application is in its own independent section, so sections can be deleted or permuted as desired. Each topic begins with a list of linear algebra prerequisites.


Because our primary objective in this chapter is to present applications of linear algebra, proofs are often omitted. Whenever results from other fields are needed, they are stated precisely, with motivation where possible, but usually without proof.
10.1 Constructing Curves and Surfaces Through Specified Points


In this section we describe a technique that uses determinants to construct lines, circles, and general conic 
sections through specified points in the plane. The procedure is also used to pass planes and spheres in 3-space 
through fixed points. 


Prerequisites 


Linear Systems 
Determinants 


Analytic Geometry 


The following theorem follows from Theorem 2.3.8. 


THEOREM 10.1.1 


A homogeneous linear system with as many equations as unknowns has a nontrivial solution if and only 
if the determinant of the coefficient matrix is zero. 


We will now show how this result can be used to determine equations of various curves and surfaces through 
specified points. 
A Line Through Two Points

Suppose that $(x_1, y_1)$ and $(x_2, y_2)$ are two distinct points in the plane. There exists a unique line

$$c_1x + c_2y + c_3 = 0 \tag{1}$$

that passes through these two points (Figure 10.1.1). Note that $c_1$, $c_2$, and $c_3$ are not all zero and that these coefficients are unique only up to a multiplicative constant. Because $(x_1, y_1)$ and $(x_2, y_2)$ lie on the line, substituting them in (1) gives the two equations

$$c_1x_1 + c_2y_1 + c_3 = 0 \tag{2}$$

$$c_1x_2 + c_2y_2 + c_3 = 0 \tag{3}$$


Figure 10.1.1


The three equations, (1), (2), and (3), can be grouped together and rewritten as

$$\begin{aligned} xc_1 + yc_2 + c_3 &= 0 \\ x_1c_1 + y_1c_2 + c_3 &= 0 \\ x_2c_1 + y_2c_2 + c_3 &= 0 \end{aligned}$$

which is a homogeneous linear system of three equations for $c_1$, $c_2$, and $c_3$. Because $c_1$, $c_2$, and $c_3$ are not all zero, this system has a nontrivial solution, so the determinant of the coefficient matrix of the system must be zero. That is,

$$\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0 \tag{4}$$

Consequently, every point $(x, y)$ on the line satisfies (4); conversely, it can be shown that every point $(x, y)$ that satisfies (4) lies on the line.


EXAMPLE 1 Equation of a Line

Find the equation of the line that passes through the two points (2, 1) and (3, 7).


Solution Substituting the coordinates of the two points into Equation 4 gives

$$\begin{vmatrix} x & y & 1 \\ 2 & 1 & 1 \\ 3 & 7 & 1 \end{vmatrix} = 0$$

The cofactor expansion of this determinant along the first row then gives

$$-6x + y + 11 = 0$$
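Expanding the determinant in Equation 4 along its first row gives the line's coefficients directly as 2 x 2 cofactors. The sketch below writes those cofactors out by hand; the function name `line_through` is illustrative and not part of the text.

```python
def line_through(p, q):
    """Coefficients (c1, c2, c3) of c1*x + c2*y + c3 = 0 through the points p and q.

    They are the first-row cofactors of the 3 x 3 determinant in Equation 4.
    """
    (x1, y1), (x2, y2) = p, q
    c1 = y1 - y2                 # cofactor of x
    c2 = -(x1 - x2)              # cofactor of y
    c3 = x1 * y2 - x2 * y1       # cofactor of 1
    return c1, c2, c3

coeffs = line_through((2, 1), (3, 7))
print(coeffs)   # (-6, 1, 11), that is, -6x + y + 11 = 0
```

If the two points coincide, all three cofactors are zero, which reflects the requirement that the points be distinct.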


A Circle Through Three Points 


Suppose that there are three distinct points in the plane, $(x_1, y_1)$, $(x_2, y_2)$, and $(x_3, y_3)$, not all lying on a straight line. From analytic geometry we know that there is a unique circle, say,

$$c_1(x^2 + y^2) + c_2x + c_3y + c_4 = 0 \tag{5}$$

that passes through them (Figure 10.1.2). Substituting the coordinates of the three points into this equation gives

$$c_1(x_1^2 + y_1^2) + c_2x_1 + c_3y_1 + c_4 = 0 \tag{6}$$

$$c_1(x_2^2 + y_2^2) + c_2x_2 + c_3y_2 + c_4 = 0 \tag{7}$$

$$c_1(x_3^2 + y_3^2) + c_2x_3 + c_3y_3 + c_4 = 0 \tag{8}$$

As before, Equations 5 through 8 form a homogeneous linear system with a nontrivial solution for $c_1$, $c_2$, $c_3$, and $c_4$. Thus the determinant of the coefficient matrix is zero:

$$\begin{vmatrix} x^2 + y^2 & x & y & 1 \\ x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ x_2^2 + y_2^2 & x_2 & y_2 & 1 \\ x_3^2 + y_3^2 & x_3 & y_3 & 1 \end{vmatrix} = 0 \tag{9}$$

This is a determinant form for the equation of the circle.


Figure 10.1.2 


EXAMPLE 2 Equation of a Circle

Find the equation of the circle that passes through the three points (1, 7), (6, 2), and (4, 6).


Solution Substituting the coordinates of the three points into Equation 9 gives

$$\begin{vmatrix} x^2 + y^2 & x & y & 1 \\ 50 & 1 & 7 & 1 \\ 40 & 6 & 2 & 1 \\ 52 & 4 & 6 & 1 \end{vmatrix} = 0$$

which reduces to

$$10(x^2 + y^2) - 20x - 40y - 200 = 0$$

In standard form this is

$$(x - 1)^2 + (y - 2)^2 = 5^2$$

Thus the circle has center (1, 2) and radius 5.
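The same first-row cofactor expansion can be automated for the circle. The sketch below uses exact rational arithmetic and a small recursive determinant that is adequate for matrices of this size; `det` and `circle_through` are illustrative names, not part of the text.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def circle_through(points):
    """Coefficients (c1, c2, c3, c4) of c1*(x^2 + y^2) + c2*x + c3*y + c4 = 0."""
    rows = [[Fraction(x * x + y * y), Fraction(x), Fraction(y), Fraction(1)]
            for x, y in points]
    # Cofactor of the j-th entry in the symbolic first row of Equation 9.
    return tuple((-1) ** j * det([r[:j] + r[j + 1:] for r in rows]) for j in range(4))

c1, c2, c3, c4 = circle_through([(1, 7), (6, 2), (4, 6)])
center = (-c2 / (2 * c1), -c3 / (2 * c1))
radius_squared = center[0] ** 2 + center[1] ** 2 - c4 / c1
print(c1, c2, c3, c4, center, radius_squared)
```

If the three points are collinear, the leading cofactor `c1` is zero and the division for the center fails, which matches the observation in Exercise 11 that the equation then degenerates to a line.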


A General Conic Section Through Five Points 


In his monumental work Principia Mathematica, Isaac Newton posed and solved the following problem (Book I, Proposition 22, Problem 14): "To describe a conic that shall pass through five given points." Newton solved this problem geometrically, as shown in Figure 10.1.3, in which he passed an ellipse through the points A, B, D, P, C; however, the methods of this section can also be applied.


Figure 10.1.3


The general equation of a conic section in the plane (a parabola, hyperbola, or ellipse, or degenerate forms of these curves) is given by

$$c_1x^2 + c_2xy + c_3y^2 + c_4x + c_5y + c_6 = 0$$

This equation contains six coefficients, but we can reduce the number to five if we divide through by any one of them that is not zero. Thus only five coefficients must be determined, so five distinct points in the plane are sufficient to determine the equation of the conic section (Figure 10.1.4). As before, the equation can be put in determinant form (see Exercise 7):

$$\begin{vmatrix} x^2 & xy & y^2 & x & y & 1 \\ x_1^2 & x_1y_1 & y_1^2 & x_1 & y_1 & 1 \\ x_2^2 & x_2y_2 & y_2^2 & x_2 & y_2 & 1 \\ x_3^2 & x_3y_3 & y_3^2 & x_3 & y_3 & 1 \\ x_4^2 & x_4y_4 & y_4^2 & x_4 & y_4 & 1 \\ x_5^2 & x_5y_5 & y_5^2 & x_5 & y_5 & 1 \end{vmatrix} = 0 \tag{10}$$

Figure 10.1.4 


EXAMPLE 3 Equation of an Orbit

An astronomer who wants to determine the orbit of an asteroid about the Sun sets up a Cartesian coordinate system in the plane of the orbit with the Sun at the origin. Astronomical units of measurement are used along the axes (1 astronomical unit = mean distance of Earth to Sun = 93 million miles). By Kepler's first law, the orbit must be an ellipse, so the astronomer makes five observations of the asteroid at five different times and finds five points along the orbit to be

(8.025, 8.310), (10.170, 6.355), (11.202, 3.212), (10.736, 0.375), (9.092, −2.267)

Find the equation of the orbit.


Solution Substituting the coordinates of the five given points into (10) and rounding to three decimal places give

$$\begin{vmatrix} x^2 & xy & y^2 & x & y & 1 \\ 64.401 & 66.688 & 69.056 & 8.025 & 8.310 & 1 \\ 103.429 & 64.630 & 40.386 & 10.170 & 6.355 & 1 \\ 125.485 & 35.981 & 10.317 & 11.202 & 3.212 & 1 \\ 115.262 & 4.026 & 0.141 & 10.736 & 0.375 & 1 \\ 82.664 & -20.612 & 5.139 & 9.092 & -2.267 & 1 \end{vmatrix} = 0$$

The cofactor expansion of this determinant along the first row yields

$$386.802x^2 - 102.895xy + 446.029y^2 - 2476.443x - 1427.998y - 17109.375 = 0$$

Figure 10.1.5 is an accurate diagram of the orbit, together with the five given points. 


Figure 10.1.5
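A quick numerical check of the orbit equation is possible without recomputing the determinant. The sketch below assumes the rounded coefficients printed in Example 3, confirms that the discriminant $b^2 - 4ac$ is negative (the standard test for an ellipse), and evaluates the conic at the five observed points; the variable names are illustrative.

```python
# Coefficients of a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 as printed in Example 3.
a, b, c = 386.802, -102.895, 446.029
d, e, f = -2476.443, -1427.998, -17109.375

points = [(8.025, 8.310), (10.170, 6.355), (11.202, 3.212),
          (10.736, 0.375), (9.092, -2.267)]

def conic_value(x, y):
    """Left side of the conic equation evaluated at (x, y)."""
    return a * x * x + b * x * y + c * y * y + d * x + e * y + f

discriminant = b * b - 4 * a * c          # negative for an ellipse
# Residuals are not exactly zero because the printed coefficients are rounded.
relative_residuals = [abs(conic_value(x, y)) / abs(f) for x, y in points]
print(discriminant, max(relative_residuals))
```

The residuals are small but nonzero because both the determinant entries and the resulting coefficients were rounded to three decimal places.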


A Plane Through Three Points 


In Exercise 8 we ask you to show the following: The plane in 3-space with equation

$$c_1x + c_2y + c_3z + c_4 = 0$$

that passes through three noncollinear points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, and $(x_3, y_3, z_3)$ is given by the determinant equation

$$\begin{vmatrix} x & y & z & 1 \\ x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \end{vmatrix} = 0 \tag{11}$$


EXAMPLE 4 Equation of a Plane

The equation of the plane that passes through the three noncollinear points (1, 1, 0), (2, 0, −1), and (2, 9, 2) is

$$\begin{vmatrix} x & y & z & 1 \\ 1 & 1 & 0 & 1 \\ 2 & 0 & -1 & 1 \\ 2 & 9 & 2 & 1 \end{vmatrix} = 0$$

which reduces to

$$2x - y + 3z - 1 = 0$$
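For the plane, each coefficient is a 3 x 3 cofactor of the determinant in Equation 11. The sketch below computes them with an explicit 3 x 3 determinant formula; `det3` and `plane_through` are illustrative names, and the result is determined only up to a common scalar factor.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by the cofactor formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def plane_through(p1, p2, p3):
    """Coefficients (c1, c2, c3, c4) of c1*x + c2*y + c3*z + c4 = 0.

    Each coefficient is a first-row cofactor of the determinant in Equation 11.
    """
    rows = [list(p) + [1] for p in (p1, p2, p3)]
    return tuple((-1) ** j * det3([r[:j] + r[j + 1:] for r in rows]) for j in range(4))

coeffs = plane_through((1, 1, 0), (2, 0, -1), (2, 9, 2))
print(coeffs)   # (6, -3, 9, -3); dividing by 3 gives 2x - y + 3z - 1 = 0
```

If the three points are collinear, all four cofactors vanish and the equation collapses to 0 = 0, in agreement with Exercise 12.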


А Sphere Through Four Points 


In Exercise 9 we ask you to show the following: The sphere in 3-space with equation

$$c_1(x^2 + y^2 + z^2) + c_2x + c_3y + c_4z + c_5 = 0$$

that passes through four noncoplanar points $(x_1, y_1, z_1)$, $(x_2, y_2, z_2)$, $(x_3, y_3, z_3)$, and $(x_4, y_4, z_4)$ is given by the following determinant equation:

$$\begin{vmatrix} x^2 + y^2 + z^2 & x & y & z & 1 \\ x_1^2 + y_1^2 + z_1^2 & x_1 & y_1 & z_1 & 1 \\ x_2^2 + y_2^2 + z_2^2 & x_2 & y_2 & z_2 & 1 \\ x_3^2 + y_3^2 + z_3^2 & x_3 & y_3 & z_3 & 1 \\ x_4^2 + y_4^2 + z_4^2 & x_4 & y_4 & z_4 & 1 \end{vmatrix} = 0 \tag{12}$$

EXAMPLE 5 Equation of a Sphere

The equation of the sphere that passes through the four points (0, 3, 2), (1, −1, 1), (2, 1, 0), and (5, 1, 3) is

$$\begin{vmatrix} x^2 + y^2 + z^2 & x & y & z & 1 \\ 13 & 0 & 3 & 2 & 1 \\ 3 & 1 & -1 & 1 & 1 \\ 5 & 2 & 1 & 0 & 1 \\ 35 & 5 & 1 & 3 & 1 \end{vmatrix} = 0$$

This reduces to

$$x^2 + y^2 + z^2 - 4x - 2y - 6z + 5 = 0$$

which in standard form is

$$(x - 2)^2 + (y - 1)^2 + (z - 3)^2 = 9$$


Exercise Set 10.1 


1. Find the equations of the lines that pass through the following points:
(a) (1, −1), (2, 2)
(b) (0, 1), (1, −1)


Answer:
(a) $y = 3x - 4$
(b) $y = -2x + 1$


2. Find the equations of the circles that pass through the following points:
(a) (2, 6), (2, 0), (5, 3)
(b) (4, −2), (3, 5), (−4, 6)


Answer:
(a) $x^2 + y^2 - 4x - 6y + 4 = 0$ or $(x - 2)^2 + (y - 3)^2 = 9$
(b) $x^2 + y^2 + 2x - 4y - 20 = 0$ or $(x + 1)^2 + (y - 2)^2 = 25$


3. Find the equation of the conic section that passes through the points (0, 0), (0, −1), (2, 0), (2, −5), and (4, −1).


Answer:

$x^2 + 2xy + y^2 - 2x + y = 0$ (a parabola)


4. Find the equations of the planes in 3-space that pass through the following points:
(a) (1, 1, −3), (1, −1, 1), (0, −1, 2)
(b) (2, 3, 1), (2, −1, −1), (1, 2, 1)

Answer:
(a) $x + 2y + z = 0$
(b) $-x + y - 2z + 1 = 0$

5. (a) Alter Equation 11 so that it determines the plane that passes through the origin and is parallel to the plane that passes through three specified noncollinear points.

(b) Find the two planes described in part (a) corresponding to the triplets of points in Exercises 4(a) and 4(b).

Answer:
(a) $\begin{vmatrix} x & y & z & 0 \\ x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \end{vmatrix} = 0$
(b) $x + 2y + z = 0$; $-x + y - 2z = 0$

6. Find the equations of the spheres in 3-space that pass through the following points:
(a) (1, 2, 3), (−1, 2, 1), (1, 0, 1), (1, 2, −1)
(b) (0, 1, −2), (1, 2, 1), (2, −1, 0), (2, 1, −1)


Answer:
(a) $x^2 + y^2 + z^2 - 2x - 4y - 2z = -2$ or $(x - 1)^2 + (y - 2)^2 + (z - 1)^2 = 4$
(b) $x^2 + y^2 + z^2 - 2x - 2y = 3$ or $(x - 1)^2 + (y - 1)^2 + z^2 = 5$
7. Show that Equation 10 is the equation of the conic section that passes through five given distinct points in the plane.


8. Show that Equation 11 is the equation of the plane in 3-space that passes through three given noncollinear points.


9. Show that Equation 12 is the equation of the sphere in 3-space that passes through four given noncoplanar points.


10. Find a determinant equation for the parabola of the form

$$c_1y + c_2x^2 + c_3x + c_4 = 0$$

that passes through three given noncollinear points in the plane.


Answer:

$$\begin{vmatrix} y & x^2 & x & 1 \\ y_1 & x_1^2 & x_1 & 1 \\ y_2 & x_2^2 & x_2 & 1 \\ y_3 & x_3^2 & x_3 & 1 \end{vmatrix} = 0$$

11. What does Equation 9 become if the three distinct points are collinear? 
Answer: 


The equation of the line through the three collinear points 


12. What does Equation 11 become if the three distinct points are collinear? 


Answer: 


0= 0 


13. What does Equation 12 become if the four points are coplanar?
Answer: 


The equation of the plane through the four coplanar points 
Section 10.1 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic 
proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be 
able to use your technology utility to solve many of the problems in the regular exercise sets. 


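T1. The general equation of a quadric surface is given by
\[ a_1x^2 + a_2y^2 + a_3z^2 + a_4xy + a_5xz + a_6yz + a_7x + a_8y + a_9z + a_{10} = 0 \]
Given nine points on this surface, it may be possible to determine its equation.


(a) Show that if the nine points \((x_i, y_i, z_i)\) for \(i = 1, 2, 3, \ldots, 9\) lie on this surface, and if they
determine uniquely the equation of this surface, then its equation can be written in determinant form as


\[
\begin{vmatrix}
x^2 & y^2 & z^2 & xy & xz & yz & x & y & z & 1 \\
x_1^2 & y_1^2 & z_1^2 & x_1y_1 & x_1z_1 & y_1z_1 & x_1 & y_1 & z_1 & 1 \\
x_2^2 & y_2^2 & z_2^2 & x_2y_2 & x_2z_2 & y_2z_2 & x_2 & y_2 & z_2 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x_9^2 & y_9^2 & z_9^2 & x_9y_9 & x_9z_9 & y_9z_9 & x_9 & y_9 & z_9 & 1
\end{vmatrix} = 0
\]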
(b) Use the result in part (a) to determine the equation of the quadric surface that passes through the points 


\((1, 2, 3)\), \((2, 1, 7)\), \((0, 4, 6)\), \((3, -1, 4)\), \((3, 0, 1)\), \((-1, 5, 8)\), \((9, -8, 3)\), \((4, 5, 3)\), and
\((-2, 6, 10)\).


T2.
(a) A hyperplane in the n-dimensional Euclidean space \(R^n\) has an equation of the form
\[ a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n + a_{n+1} = 0 \]
where \(a_i\), \(i = 1, 2, 3, \ldots, n + 1\), are constants, not all zero, and \(x_i\), \(i = 1, 2, 3, \ldots, n\), are
variables for which
\[ (x_1, x_2, x_3, \ldots, x_n) \in R^n \]
A point
\[ (x_{10}, x_{20}, x_{30}, \ldots, x_{n0}) \in R^n \]
lies on this hyperplane if
\[ a_1x_{10} + a_2x_{20} + a_3x_{30} + \cdots + a_nx_{n0} + a_{n+1} = 0 \]
Given that the n points \((x_{1i}, x_{2i}, x_{3i}, \ldots, x_{ni})\), \(i = 1, 2, 3, \ldots, n\), lie on this hyperplane and that
they uniquely determine the equation of the hyperplane, show that the equation of the hyperplane can be written
in determinant form as


\[
\begin{vmatrix}
x_1 & x_2 & x_3 & \cdots & x_n & 1 \\
x_{11} & x_{21} & x_{31} & \cdots & x_{n1} & 1 \\
x_{12} & x_{22} & x_{32} & \cdots & x_{n2} & 1 \\
x_{13} & x_{23} & x_{33} & \cdots & x_{n3} & 1 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
x_{1n} & x_{2n} & x_{3n} & \cdots & x_{nn} & 1
\end{vmatrix} = 0
\]


(b) Determine the equation of the hyperplane in \(R^9\) that goes through the following nine points:


(1, 2, 3, 4, 5, 6, 7, 8, 9)    (2, 3, 4, 5, 6, 7, 8, 9, 1)
(3, 4, 5, 6, 7, 8, 9, 1, 2)    (4, 5, 6, 7, 8, 9, 1, 2, 3)
(5, 6, 7, 8, 9, 1, 2, 3, 4)    (6, 7, 8, 9, 1, 2, 3, 4, 5)
(7, 8, 9, 1, 2, 3, 4, 5, 6)    (8, 9, 1, 2, 3, 4, 5, 6, 7)
(9, 1, 2, 3, 4, 5, 6, 7, 8)
10.2 Geometric Linear Programming


In this section we describe a geometric technique for maximizing or minimizing a linear expression in two 
variables subject to a set of linear constraints. 


Prerequisites 


Linear Systems 


Linear Inequalities 


Linear Programming 


The study of linear programming theory has expanded greatly since the pioneering work of George Dantzig in 
the late 1940s. Today, linear programming is applied to a wide variety of problems in industry and science. In 
this section we present a geometric approach to the solution of simple linear programming problems. Let us 
begin with some examples. 


EXAMPLE 1 Maximizing Sales Revenue


A candy manufacturer has 130 pounds of chocolate-covered cherries and 170 pounds of 
chocolate-covered mints in stock. He decides to sell them in the form of two different mixtures. 
One mixture will contain half cherries and half mints by weight and will sell for $2.00 per 
pound. The other mixture will contain one-third cherries and two-thirds mints by weight and 
will sell for $1.25 per pound. How many pounds of each mixture should the candy 
manufacturer prepare in order to maximize his sales revenue? 


Mathematical Formulation Let the mixture of half cherries and half mints be called mix A,
and let \(x_1\) be the number of pounds of this mixture to be prepared. Let the mixture of one-third
cherries and two-thirds mints be called mix B, and let \(x_2\) be the number of pounds of this
mixture to be prepared. Since mix A sells for $2.00 per pound and mix B sells for $1.25 per
pound, the total sales z (in dollars) will be


\[ z = 2.00x_1 + 1.25x_2 \]


Since each pound of mix A contains 1/2 pound of cherries and each pound of mix B contains 1/3
pound of cherries, the total number of pounds of cherries used in both mixtures is

\[ \tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 \]

Similarly, since each pound of mix A contains 1/2 pound of mints and each pound of mix B
contains 2/3 pound of mints, the total number of pounds of mints used in both mixtures is

\[ \tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 \]

Because the manufacturer can use at most 130 pounds of cherries and 170 pounds of mints, we
must have

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170
\end{aligned}
\]

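Furthermore, since \(x_1\) and \(x_2\) cannot be negative numbers, we must have

\[ x_1 \ge 0 \quad\text{and}\quad x_2 \ge 0 \]

The problem can therefore be formulated mathematically as follows: Find values of \(x_1\) and \(x_2\)
that maximize

\[ z = 2.00x_1 + 1.25x_2 \]

subject to

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]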

Later in this section we will show how to solve this type of mathematical problem 
geometrically. 


EXAMPLE 2 Maximizing Annual Yield


A woman has up to $10,000 to invest. Her broker suggests investing in two bonds, A and B.
Bond A is a rather risky bond with an annual yield of 10%, and bond B is a rather safe bond
with an annual yield of 7%. After some consideration, she decides to invest at most $6000 in
bond A, to invest at least $2000 in bond B, and to invest at least as much in bond A as in bond
B. How should she invest her money in order to maximize her annual yield?


Mathematical Formulation Let \(x_1\) be the number of dollars to be invested in bond A, and
let \(x_2\) be the number of dollars to be invested in bond B. Since each dollar invested in bond A
earns $.10 per year and each dollar invested in bond B earns $.07 per year, the total dollar
amount z earned each year by both bonds is

\[ z = .10x_1 + .07x_2 \]

The constraints imposed can be formulated mathematically as follows:

Invest no more than $10,000:                       \(x_1 + x_2 \le 10{,}000\)
Invest at most $6000 in bond A:                    \(x_1 \le 6000\)
Invest at least $2000 in bond B:                   \(x_2 \ge 2000\)
Invest at least as much in bond A as in bond B:    \(x_1 \ge x_2\)

We also have the implicit assumption that \(x_1\) and \(x_2\) are nonnegative:

\[ x_1 \ge 0 \quad\text{and}\quad x_2 \ge 0 \]

Thus the complete mathematical formulation of the problem is as follows: Find values of \(x_1\)
and \(x_2\) that maximize

\[ z = .10x_1 + .07x_2 \]

subject to

\[
\begin{aligned}
x_1 + x_2 &\le 10{,}000 \\
x_1 &\le 6000 \\
x_2 &\ge 2000 \\
x_1 - x_2 &\ge 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


EXAMPLE 3 Minimizing Cost


A student desires to design a breakfast of cornflakes and milk that is as economical as possible.
On the basis of what he eats during his other meals, he decides that his breakfast should supply
him with at least 9 grams of protein, at least 1/3 the recommended daily allowance (RDA) of
vitamin D, and at least 1/4 the RDA of calcium. He finds the following nutrition and cost
information on the milk and cornflakes containers:


               Milk          Cornflakes
               (1/2 cup)     (1 ounce)
Cost           7.5 cents     5.0 cents
Protein        4 grams       2 grams
Vitamin D      1/8 of RDA    1/10 of RDA
Calcium        1/6 of RDA    None


In order not to have his mixture too soggy or too dry, the student decides to limit himself to
mixtures that contain 1 to 3 ounces of cornflakes per cup of milk, inclusive. What quantities of
milk and cornflakes should he use to minimize the cost of his breakfast?


Mathematical Formulation Let \(x_1\) be the quantity of milk used (measured in 1/2-cup units),
and let \(x_2\) be the quantity of cornflakes used (measured in 1-ounce units). Then if z is the cost
of the breakfast in cents, we may write the following.


Cost of breakfast:                  \(z = 7.5x_1 + 5.0x_2\)
At least 9 grams protein:           \(4x_1 + 2x_2 \ge 9\)
At least 1/3 RDA vitamin D:         \(\tfrac{1}{8}x_1 + \tfrac{1}{10}x_2 \ge \tfrac{1}{3}\)
At least 1/4 RDA calcium:           \(\tfrac{1}{6}x_1 \ge \tfrac{1}{4}\)
At least 1 ounce cornflakes
per cup (two 1/2 cups) of milk:     \(\dfrac{x_2}{x_1} \ge \dfrac{1}{2}\) (or \(x_1 - 2x_2 \le 0\))
At most 3 ounces cornflakes
per cup (two 1/2 cups) of milk:     \(\dfrac{x_2}{x_1} \le \dfrac{3}{2}\) (or \(3x_1 - 2x_2 \ge 0\))


As before, we also have the implicit assumption that \(x_1 \ge 0\) and \(x_2 \ge 0\). Thus the complete
mathematical formulation of the problem is as follows: Find values of \(x_1\) and \(x_2\) that minimize


\[ z = 7.5x_1 + 5.0x_2 \]


subject to

\[
\begin{aligned}
4x_1 + 2x_2 &\ge 9 \\
\tfrac{1}{8}x_1 + \tfrac{1}{10}x_2 &\ge \tfrac{1}{3} \\
\tfrac{1}{6}x_1 &\ge \tfrac{1}{4} \\
x_1 - 2x_2 &\le 0 \\
3x_1 - 2x_2 &\ge 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Geometric Solution of Linear Programming Problems 


Each of the preceding three examples is a special case of the following problem. 


Problem 


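Find values of \(x_1\) and \(x_2\) that either maximize or minimize

\[ z = c_1x_1 + c_2x_2 \tag{1} \]

subject to

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 \;&(\le)(\ge)(=)\; b_1 \\
a_{21}x_1 + a_{22}x_2 \;&(\le)(\ge)(=)\; b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 \;&(\le)(\ge)(=)\; b_m
\end{aligned}
\tag{2}
\]

and

\[ x_1 \ge 0, \quad x_2 \ge 0 \tag{3} \]

In each of the m conditions of (2), any one of the symbols \(\le\), \(\ge\), and \(=\) may be used.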


The problem above is called the general linear programming problem in two variables. The linear function z
in (1) is called the objective function. Conditions (2) and (3) are called the constraints; in particular, the
inequalities in (3) are called the nonnegativity constraints on the variables \(x_1\) and \(x_2\).


We will now show how to solve a linear programming problem in two variables graphically. A pair of values
\((x_1, x_2)\) that satisfy all of the constraints is called a feasible solution. The set of all feasible solutions
determines a subset of the \(x_1x_2\)-plane called the feasible region. Our desire is to find a feasible solution that
maximizes (or, in a minimization problem, minimizes) the objective function. Such a solution is called an
optimal solution.


To examine the feasible region of a linear programming problem, let us note that each constraint of the form
\[ a_{i1}x_1 + a_{i2}x_2 = b_i \]
defines a line in the \(x_1x_2\)-plane, whereas each constraint of the form
\[ a_{i1}x_1 + a_{i2}x_2 \le b_i \quad\text{or}\quad a_{i1}x_1 + a_{i2}x_2 \ge b_i \]
defines a half-plane that includes its boundary line
\[ a_{i1}x_1 + a_{i2}x_2 = b_i \]
Thus the feasible region is always an intersection of finitely many lines and half-planes. For example, the four
constraints

\[
\begin{aligned}
\tfrac{1}{2}x_1 + \tfrac{1}{3}x_2 &\le 130 \\
\tfrac{1}{2}x_1 + \tfrac{2}{3}x_2 &\le 170 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

of Example 1 define the half-planes illustrated in parts (a), (b), (c), and (d) of Figure 10.2.1. The feasible
region of this problem is thus the intersection of these four half-planes, which is illustrated in Figure 10.2.1e.


Figure 10.2.1 [Panels (a) through (d): the four half-planes. Panel (e): the feasible region, with labeled
points including (0, 0), (180, 120), and (260, 0).]


It can be shown that the feasible region of a linear programming problem has a boundary consisting of a finite
number of straight line segments. If the feasible region can be enclosed in a sufficiently large circle, it is
called bounded (see Figure 10.2.1e); otherwise, it is called unbounded (see Figure 10.2.5). If the feasible region
is empty (contains no points), then the constraints are inconsistent and the linear programming problem has no
solution (see Figure 10.2.6).


Those boundary points of a feasible region that are intersections of two of the straight line boundary segments 
are called extreme points. (They are also called corner points and vertex points.) For example, in Figure 
10.2.1e, we see that the feasible region of Example 1 has four extreme points: 


(0,0), (0,255), (180,120), (260,0) (4) 


The importance of the extreme points of a feasible region is shown by the following theorem. 


THEOREM 10.2.1 Maximum and Minimum Values 


If the feasible region of a linear programming problem is nonempty and bounded, then the objective 
function attains both a maximum and a minimum value, and these occur at extreme points of the 
feasible region. If the feasible region is unbounded, then the objective function may or may not attain 
a maximum or minimum value; however, if it attains a maximum or minimum value, it does so at an 
extreme point. 


Figure 10.2.2 suggests the idea behind the proof of this theorem. Since the objective function

\[ z = c_1x_1 + c_2x_2 \]

of a linear programming problem is a linear function of \(x_1\) and \(x_2\), its level curves (the curves along which z
has constant values) are straight lines. As we move in a direction perpendicular to these level curves, the
objective function either increases or decreases monotonically. Within a bounded feasible region, the
maximum and minimum values of z must therefore occur at extreme points, as Figure 10.2.2 indicates.


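Figure 10.2.2 [Level curves of z drawn across a feasible region, with labels marking the directions in which
z is increasing and decreasing and the points at which z is minimized and maximized.]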


In the next few examples we use Theorem 10.2.1 to solve several linear programming problems and illustrate 
the variations in the nature of the solutions that may occur. 


EXAMPLE 4 Example 1 Revisited


Figure 10.2.1e shows that the feasible region of Example 1 is bounded. Consequently, from
Theorem 10.2.1 the objective function


\[ z = 2.00x_1 + 1.25x_2 \]


attains both its minimum and maximum values at extreme points. The four extreme points and
the corresponding values of z are given in the following table.


Extreme Point      Value of
\((x_1, x_2)\)     \(z = 2.00x_1 + 1.25x_2\)

(0, 0)               0
(0, 255)           318.75
(180, 120)         510.00
(260, 0)           520.00


We see that the largest value of z is 520.00 and the corresponding optimal solution is (260, 0).
Thus the candy manufacturer attains maximum sales of $520 when he produces 260 pounds of
mixture A and none of mixture B.
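

The table above can also be produced mechanically. The following is a minimal sketch, not part of
the geometric method itself: it assumes every constraint has been rewritten in the form
a1*x1 + a2*x2 <= b, intersects each pair of boundary lines by Cramer's rule, and keeps the
intersections that satisfy all of the constraints. The names CONSTRAINTS, extreme_points, and
objective are our own, chosen only for illustration.

```python
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
# The nonnegativity constraints are rewritten as -x1 <= 0 and -x2 <= 0.
CONSTRAINTS = [
    (1 / 2, 1 / 3, 130.0),  # cherries
    (1 / 2, 2 / 3, 170.0),  # mints
    (-1.0, 0.0, 0.0),       # x1 >= 0
    (0.0, -1.0, 0.0),       # x2 >= 0
]


def extreme_points(constraints, tol=1e-9):
    """Intersect every pair of boundary lines; keep the feasible intersections."""
    points = []
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:
            continue  # parallel boundary lines never meet
        # Cramer's rule for the 2 x 2 system of boundary equations
        x1 = (b * c2 - a2 * d) / det
        x2 = (a1 * d - b * c1) / det
        if all(p * x1 + q * x2 <= r + tol for p, q, r in constraints):
            points.append((round(x1, 6), round(x2, 6)))
    return sorted(set(points))


def objective(point):
    """Sales revenue z = 2.00*x1 + 1.25*x2 of Example 1."""
    x1, x2 = point
    return 2.00 * x1 + 1.25 * x2


vertices = extreme_points(CONSTRAINTS)
best = max(vertices, key=objective)
for v in vertices:
    print(v, objective(v))
print("optimal:", best, objective(best))
```

Enumerating all pairs of boundary lines is practical only for small problems in two variables such
as this one; it is meant to mirror the reasoning of Theorem 10.2.1, not to replace a general-purpose
solver.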


EXAMPLE 5 Using Theorem 10.2.1


Find values of \(x_1\) and \(x_2\) that maximize


\[ z = x_1 + 3x_2 \]


subject to

\[
\begin{aligned}
2x_1 + 3x_2 &\le 24 \\
x_1 - x_2 &\le 7 \\
x_2 &\le 6 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution In Figure 10.2.3 we have drawn the feasible region of this problem. Since it is
bounded, the maximum value of z is attained at one of the five extreme points. The values of
the objective function at the five extreme points are given in the following table.


Figure 10.2.3 [The feasible region, with labeled points including (0, 0) and (7, 0).]


Extreme Point      Value of
\((x_1, x_2)\)     \(z = x_1 + 3x_2\)

(0, 6)             18
(3, 6)             21
(9, 2)             15
(7, 0)              7
(0, 0)              0


From this table, the maximum value of z is 21, which is attained at \(x_1 = 3\) and \(x_2 = 6\).


EXAMPLE 6 Using Theorem 10.2.1


Find values of \(x_1\) and \(x_2\) that maximize
\[ z = 4x_1 + 6x_2 \]
subject to

\[
\begin{aligned}
2x_1 + 3x_2 &\le 24 \\
x_1 - x_2 &\le 7 \\
x_2 &\le 6 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution The constraints in this problem are identical to the constraints in Example 5, so the
feasible region of this problem is also given by Figure 10.2.3. The values of the objective
function at the extreme points are given in the following table.


Extreme Point      Value of
\((x_1, x_2)\)     \(z = 4x_1 + 6x_2\)

(0, 6)             36
(3, 6)             48
(9, 2)             48
(7, 0)             28
(0, 0)              0


We see that the objective function attains a maximum value of 48 at two adjacent extreme
points, (3, 6) and (9, 2). This shows that an optimal solution to a linear programming problem
need not be unique. As we ask you to show in Exercise 10, if the objective function has the
same value at two adjacent extreme points, it has the same value at all points on the straight line
boundary segment connecting the two extreme points. Thus, in this example the maximum
value of z is attained at all points on the straight line segment connecting the extreme points
(3, 6) and (9, 2).
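

For this particular example the claim can be checked directly. The short computation below is our
own illustration; it uses the parametrization of a line segment by a number t in [0, 1], with
t = 1 giving (3, 6) and t = 0 giving (9, 2).

```latex
\begin{aligned}
(x_1, x_2) &= t\,(3, 6) + (1 - t)\,(9, 2) = (9 - 6t,\; 2 + 4t), \qquad 0 \le t \le 1, \\
z &= 4(9 - 6t) + 6(2 + 4t) = 36 - 24t + 12 + 24t = 48 .
\end{aligned}
```

The terms in t cancel, so z = 48 at every point of the segment, in agreement with the values at its
two endpoints.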


EXAMPLE 7 The Feasible Region Is a Line Segment


Find values of \(x_1\) and \(x_2\) that minimize


\[ z = 2x_1 - x_2 \]
subject to

\[
\begin{aligned}
2x_1 + 3x_2 &= 12 \\
2x_1 - 3x_2 &\ge 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution In Figure 10.2.4 we have drawn the feasible region of this problem. Because one of
the constraints is an equality constraint, the feasible region is a straight line segment with two
extreme points. The values of z at the two extreme points are given in the following table.


Figure 10.2.4


Extreme Point      Value of
\((x_1, x_2)\)     \(z = 2x_1 - x_2\)

(3, 2)              4
(6, 0)             12


The minimum value of z is thus 4 and is attained at \(x_1 = 3\) and \(x_2 = 2\).


EXAMPLE 8 Using Theorem 10.2.1


Find values of \(x_1\) and \(x_2\) that maximize


\[ z = 2x_1 + 5x_2 \]
subject to

\[
\begin{aligned}
2x_1 + x_2 &\ge 8 \\
-4x_1 + x_2 &\le 2 \\
2x_1 - 3x_2 &\le 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution The feasible region of this linear programming problem is illustrated in Figure
10.2.5. Since it is unbounded, we are not assured by Theorem 10.2.1 that the objective function
attains a maximum value. In fact, it is easily seen that since the feasible region contains points
for which both \(x_1\) and \(x_2\) are arbitrarily large and positive, the objective function

\[ z = 2x_1 + 5x_2 \]

can be made arbitrarily large and positive. This problem has no optimal solution. Instead, we
say the problem has an unbounded solution.
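

One concrete way to see this is to follow the points (t, t) as t grows. The sketch below is our own
check, with the hypothetical helper names feasible and z; it confirms that these points satisfy
every constraint of this example while the objective value 7t grows without bound.

```python
def feasible(x1, x2):
    """Constraints of Example 8."""
    return (
        2 * x1 + x2 >= 8
        and -4 * x1 + x2 <= 2
        and 2 * x1 - 3 * x2 <= 0
        and x1 >= 0
        and x2 >= 0
    )


def z(x1, x2):
    """Objective function of Example 8."""
    return 2 * x1 + 5 * x2


# The points (t, t) are feasible for every t >= 8/3, and z(t, t) = 7t.
for t in (3, 30, 300, 3000):
    assert feasible(t, t)
    print(t, z(t, t))
```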


Figure 10.2.5 [The unbounded feasible region; one labeled boundary line is \(-4x_1 + x_2 = 2\).]


EXAMPLE 9 Using Theorem 10.2.1


Find values of \(x_1\) and \(x_2\) that maximize
\[ z = -5x_1 + x_2 \]


subject to

\[
\begin{aligned}
2x_1 + x_2 &\ge 8 \\
-4x_1 + x_2 &\le 2 \\
2x_1 - 3x_2 &\le 0 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution The above constraints are the same as those in Example 8, so the feasible region of
this problem is also given by Figure 10.2.5. In Exercise 11 we ask you to show that the
objective function of this problem attains a maximum within the feasible region. By Theorem
10.2.1, this maximum must be attained at an extreme point. The values of z at the two extreme
points of the feasible region are given in the following table.


Extreme Point      Value of
\((x_1, x_2)\)     \(z = -5x_1 + x_2\)

(1, 6)               1
(3, 2)             -13


The maximum value of z is thus 1 and is attained at the extreme point \(x_1 = 1\), \(x_2 = 6\).


EXAMPLE 10 Inconsistent Constraints


Find values of \(x_1\) and \(x_2\) that minimize


\[ z = 3x_1 - 8x_2 \]
subject to

\[
\begin{aligned}
2x_1 - x_2 &\le 4 \\
3x_1 + 11x_2 &\le 33 \\
3x_1 + 4x_2 &\ge 24 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]


Solution As can be seen from Figure 10.2.6, the intersection of the five half-planes defined 
by the five constraints is empty. This linear programming problem has no feasible solutions 
since the constraints are inconsistent. 


Figure 10.2.6 There are no points common to all five shaded half-planes. 
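

The inconsistency can also be confirmed algebraically, without the figure. The computation below is
our own check rather than part of the geometric method; the multipliers 21/25 and 11/25 are chosen
so that the combination of the first two left-hand sides equals the left-hand side of the third
constraint.

```latex
\begin{aligned}
3x_1 + 4x_2 &= \tfrac{21}{25}\,(2x_1 - x_2) + \tfrac{11}{25}\,(3x_1 + 11x_2) \\
            &\le \tfrac{21}{25}\cdot 4 + \tfrac{11}{25}\cdot 33 = \tfrac{447}{25} = 17.88 < 24 .
\end{aligned}
```

Thus every point that satisfies the first two constraints has \(3x_1 + 4x_2 \le 17.88\), so none of
them can satisfy \(3x_1 + 4x_2 \ge 24\).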


Exercise Set 10.2 


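1. Find values of \(x_1\) and \(x_2\) that maximize


\[ z = 3x_1 + 2x_2 \]
subject to

\[
\begin{aligned}
2x_1 + 3x_2 &\le 6 \\
2x_1 - x_2 &\ge 0 \\
x_1 &\le 2 \\
x_2 &\le 1 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

Answer:


\(x_1 = 2\), \(x_2 = \tfrac{2}{3}\); maximum value of \(z = \tfrac{22}{3}\)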


2. Find values of \(x_1\) and \(x_2\) that minimize


\[ z = 3x_1 - 5x_2 \]
subject to

\[
\begin{aligned}
2x_1 - x_2 &\le -2 \\
4x_1 - x_2 &\ge 0 \\
x_2 &\le 3 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

Answer: 


No feasible solutions 
3. Find values of \(x_1\) and \(x_2\) that minimize


\[ z = -3x_1 + 2x_2 \]


subject to

\[
\begin{aligned}
3x_1 - x_2 &\ge -5 \\
-x_1 + x_2 &\ge 1 \\
2x_1 + 4x_2 &\ge 12 \\
x_1 &\ge 0 \\
x_2 &\ge 0
\end{aligned}
\]

Answer: 


Unbounded solution 


4. Solve the linear programming problem posed in Example 2. 
Answer: 


Invest $6000 in bond A and $4000 in bond B; the annual yield is $880. 


5. Solve the linear programming problem posed in Example 3. 


Answer: 
7/9 cup of milk, 25/18 ounces of corn flakes; minimum cost = 335/18 ≈ 18.6¢


6. In Example 5 the constraint \(x_1 - x_2 \le 7\) is said to be nonbinding because it can be removed from the
problem without affecting the solution. By contrast, the constraint \(x_2 \le 6\) is said to be binding because
removing it will change the solution.

(a) Which of the remaining constraints are nonbinding and which are binding?
(b) For what values of the right-hand side of the nonbinding constraint \(x_1 - x_2 \le 7\) will this constraint
become binding? For what values will the resulting feasible set be empty?
(c) For what values of the right-hand side of the binding constraint \(x_2 \le 6\) will this constraint become
nonbinding? For what values will the resulting feasible set be empty?


Answer:


(a) \(x_1 \ge 0\) and \(x_2 \ge 0\) are nonbinding; \(2x_1 + 3x_2 \le 24\) is binding
(b) \(x_1 - x_2 \le v\) for \(v < -3\) is binding and for \(v < -6\) yields the empty set.
(c) \(x_2 \le v\) for \(v > 8\) is nonbinding and for \(v < 0\) yields the empty set.


7. A trucking firm ships the containers of two companies, A and B. Each container from company A weighs
40 pounds and is 2 cubic feet in volume. Each container from company B weighs 50 pounds and is 3 cubic
feet in volume. The trucking firm charges company A $2.20 for each container shipped and charges
company B $3.00 for each container shipped. If one of the firm's trucks cannot carry more than 37,000
pounds and cannot hold more than 2000 cubic feet, how many containers from companies A and B should
a truck carry to maximize the shipping charges?


Answer: 


550 containers from company A and 300 containers from company B; maximum shipping
charges = $2110


8. Repeat Exercise 7 if the trucking firm raises its price for shipping a container from company A to $2.50.


Answer: 


925 containers from company A and no containers from company B; maximum shipping
charges = $2312.50


9. A manufacturer produces sacks of chicken feed from two ingredients, A and B. Each sack is to contain at
least 10 ounces of nutrient \(N_1\), at least 8 ounces of nutrient \(N_2\), and at least 12 ounces of nutrient \(N_3\).
Each pound of ingredient A contains 2 ounces of nutrient \(N_1\), 2 ounces of nutrient \(N_2\), and 6 ounces of
nutrient \(N_3\). Each pound of ingredient B contains 5 ounces of nutrient \(N_1\), 3 ounces of nutrient \(N_2\), and 4
ounces of nutrient \(N_3\). If ingredient A costs 8 cents per pound and ingredient B costs 9 cents per pound,
how much of each ingredient should the manufacturer use in each sack of feed to minimize his costs?


Answer:


0.4 pound of ingredient A and 2.4 pounds of ingredient B; minimum cost = 24.8¢


10. If the objective function of a linear programming problem has the same value at two adjacent extreme
points, show that it has the same value at all points on the straight line segment connecting the two
extreme points. [Hint: If \((x_1', x_2')\) and \((x_1'', x_2'')\) are any two points in the plane, a point \((x_1, x_2)\) lies on
the straight line segment connecting them if

\[ x_1 = t x_1' + (1 - t) x_1'' \]
and
\[ x_2 = t x_2' + (1 - t) x_2'' \]
where t is a number in the interval [0, 1].]


11. Show that the objective function in Example 9 attains a maximum value in the feasible set. [Hint:
Examine the level curves of the objective function.]


Section 10.2 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the 
relevant documentation for the particular utility you are using. The goal of these exercises is to provide you 
with a basic proficiency with your technology utility. Once you have mastered the techniques in these 
exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


T1. Consider the feasible region consisting of \(0 \le x\), \(0 \le y\) along with the set of inequalities


\[ x \cos\left(\frac{(2k + 1)\pi}{4n}\right) + y \sin\left(\frac{(2k + 1)\pi}{4n}\right) \le \cos\left(\frac{\pi}{4n}\right) \]


for \(k = 0, 1, 2, \ldots, n - 1\). Maximize the objective function
\[ z = 3x + 4y \]


assuming that (a) n = 1, (b) n = 2, (c) n = 3, (d) n = 4, (e) n = 5, (f) n = 6, (g) n = 7, (h) n = 8, (i) n = 9,
(j) n = 10, and (k) n = 11. (l) Next, maximize this objective function using the nonlinear feasible region,
\(0 \le x\), \(0 \le y\), and

\[ x^2 + y^2 \le 1 \]
(m) Let the results of parts (a) through (k) begin a sequence of values for \(z_{\max}\). Do these values approach the
value determined in part (l)? Explain.


T2. Repeat Exercise T1 using the objective function z = x + y. 
10.3 The Earliest Applications of Linear Algebra


Linear systems can be found in the earliest writings of many ancient civilizations. In this section we give 
some examples of the types of problems that they used to solve. 


Prerequisites 


Linear Systems 


The practical problems of early civilizations included the measurement of land, the distribution of goods, the 
tracking of resources such as wheat and cattle, and taxation and inheritance calculations. In many cases, these 
problems led to linear systems of equations since linearity is one of the simplest relationships that can exist 
among variables. In this section we present examples from five diverse ancient cultures illustrating how they 
used and solved systems of linear equations. We restrict ourselves to examples before A.D. 500. These 
examples consequently predate the development of the field of algebra by Islamic/Arab mathematicians, a 
field that ultimately led in the nineteenth century to the branch of mathematics now called linear algebra. 


EXAMPLE 1 Egypt (about 1650 B.C.)


Problem 40 of the Ahmes Papyrus


The Ahmes (or Rhind) Papyrus is the source of most of our information about ancient Egyptian 
mathematics. This 5-meter-long papyrus contains 84 short mathematical problems, together 
with their solutions, and dates from about 1650 B.C. Problem 40 in this papyrus is the following: 


Divide 100 hekats of barley among five men in arithmetic progression so that the sum of 
the two smallest is one-seventh the sum of the three largest. 


Let a be the least amount that any man obtains, and let d be the common difference of the terms
in the arithmetic progression. Then the other four men receive a + d, a + 2d, a + 3d, and
a + 4d hekats. The two conditions of the problem require that


\[
\begin{aligned}
a + (a + d) + (a + 2d) + (a + 3d) + (a + 4d) &= 100 \\
\tfrac{1}{7}\left[(a + 2d) + (a + 3d) + (a + 4d)\right] &= a + (a + d)
\end{aligned}
\]


These equations reduce to the following system of two equations in two unknowns:


\[
\begin{aligned}
5a + 10d &= 100 \\
11a - 2d &= 0
\end{aligned}
\tag{1}
\]


The solution technique described in the papyrus is known as the method of false position or
false assumption. It begins by assuming some convenient value of a (in our case a = 1),
substituting that value into the second equation, and obtaining d = 11/2. Substituting a = 1
and d = 11/2 into the left-hand side of the first equation gives 60, whereas the right-hand side
is 100. Adjusting the initial guess for a by multiplying it by 100/60 leads to the correct value
a = 5/3. Substituting a = 5/3 into the second equation then gives d = 55/6, so the
quantities of barley received by the five men are 10/6, 65/6, 120/6, 175/6, and 230/6
hekats. This technique of guessing a value of an unknown and later adjusting it has been used
by many cultures throughout the ages.
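

The steps of the false-position computation can be retraced in exact arithmetic. The following is a
minimal sketch in modern terms, not a reconstruction of the scribe's own procedure; the helper
names total and d_from_a are ours. The rescaling works here because the second equation is
homogeneous and the left-hand side of the first is proportional to the guess.

```python
from fractions import Fraction


def total(a, d):
    """Left-hand side of the first equation in (1): 5a + 10d."""
    return 5 * a + 10 * d


def d_from_a(a):
    """Solve the second equation in (1), 11a - 2d = 0, for d."""
    return Fraction(11, 2) * a


guess_a = Fraction(1)                              # the convenient false assumption a = 1
guess_d = d_from_a(guess_a)                        # 11/2
scale = Fraction(100) / total(guess_a, guess_d)    # 100/60

a = guess_a * scale                                # corrected least share
d = d_from_a(a)                                    # corrected common difference
shares = [a + k * d for k in range(5)]
print(a, d, shares)
```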


EXAMPLE 2 Babylonia (1900-1600 B.C.)


The Old Babylonian Empire flourished in Mesopotamia between 1900 and 1600 B.C. Many clay
tablets containing mathematical tables and problems survive from that period, one of which
(designated Ca MLA 1950) contains the next problem. The statement of the problem is a bit
muddled because of the condition of the tablet, but the diagram and the solution on the tablet
indicate that the problem is as follows:




A trapezoid with an area of 320 square units is cut off from a right triangle by a line 
parallel to one of its sides. The other side has length 50 units, and the height of the 
trapezoid is 20 units. What are the upper and the lower widths of the trapezoid? 


Let x be the lower width of the trapezoid and y its upper width. The area of the trapezoid is its
height times its average width, so \(20\left(\dfrac{x + y}{2}\right) = 320\). Using similar triangles, we also have
\(\dfrac{x}{50} = \dfrac{y}{30}\). The solution on the tablet uses these relations to generate the linear system


\[
\begin{aligned}
\tfrac{1}{2}(x + y) &= 16 \\
\tfrac{1}{2}(x - y) &= 4
\end{aligned}
\tag{2}
\]


Adding and subtracting these two equations then gives the solution x = 20 and y = 12.
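

Written out in modern notation (our own illustration, not the wording of the tablet), the two steps
and a check against the original conditions are:

```latex
\begin{aligned}
\tfrac{1}{2}(x + y) + \tfrac{1}{2}(x - y) &= 16 + 4 &&\Longrightarrow\quad x = 20, \\
\tfrac{1}{2}(x + y) - \tfrac{1}{2}(x - y) &= 16 - 4 &&\Longrightarrow\quad y = 12, \\
20 \cdot \tfrac{20 + 12}{2} &= 320, &&\qquad \tfrac{20}{50} = \tfrac{12}{30} = \tfrac{2}{5} .
\end{aligned}
```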


EXAMPLE 3 China (A.D. 263)


Chiu Chang Suan Shu in Chinese characters 


The most important treatise in the history of Chinese mathematics is the Chiu Chang Suan Shu,
or "The Nine Chapters of the Mathematical Art." This treatise, which is a collection of 246
problems and their solutions, was assembled in its final form by Liu Hui in A.D. 263. Its
contents, however, go back to at least the beginning of the Han dynasty in the second century
B.C. The eighth of its nine chapters, entitled "The Way of Calculating by Arrays," contains 18
word problems that lead to linear systems in three to six unknowns. The general solution
procedure described is almost identical to the Gaussian elimination technique developed in
Europe in the nineteenth century by Carl Friedrich Gauss. The first problem in the eighth
chapter is the following:


There are three classes of corn, of which three bundles of the first class, two of the 
second, and one of the third make 39 measures. Two of the first, three of the second, and 
one of the third make 34 measures. And one of the first, two of the second, and three of 
the third make 26 measures. How many measures of grain are contained in one bundle 
of each class? 


Let x, y, and z be the measures of the first, second, and third classes of corn. Then the 
conditions of the problem lead to the following linear system of three equations in three 
unknowns: 


\[
\begin{aligned}
3x + 2y + z &= 39 \\
2x + 3y + z &= 34 \\
x + 2y + 3z &= 26
\end{aligned}
\tag{3}
\]


The solution described in the treatise represented the coefficients of each equation by an
appropriate number of rods placed within squares on a counting table. Positive coefficients
were represented by black rods, negative coefficients were represented by red rods, and the
squares corresponding to zero coefficients were left empty. The counting table was laid out as
follows so that the coefficients of each equation appear in columns with the first equation in the
rightmost column:


     1    2    3
     2    3    2
     3    1    1
    26   34   39


Next, the numbers of rods within the squares were adjusted to accomplish the following two
steps: (1) two times the numbers of the third column were subtracted from three times the
numbers in the second column and (2) the numbers in the third column were subtracted from
three times the numbers in the first column. The result was the following array:


     0    0    3
     4    5    2
     8    1    1
    39   24   39


In this array, four times the numbers in the second column were subtracted from five times the
numbers in the first column, yielding


     0    0    3
     0    5    2
    36    1    1
    99   24   39


This last array is equivalent to the linear system


\[
\begin{aligned}
3x + 2y + z &= 39 \\
5y + z &= 24 \\
36z &= 99
\end{aligned}
\]


This triangular system was solved by a method equivalent to back substitution to obtain
x = 37/4, y = 17/4, and z = 11/4.
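

The column operations described above are easy to retrace in exact arithmetic. The following is a
minimal sketch, with columns numbered from left to right as in the description and a hypothetical
helper combine of our own; it performs the three column operations and then the back substitution.

```python
from fractions import Fraction

# Columns of the counting table, left to right. Each column lists the
# coefficients of x, y, z followed by the right-hand side.
first = [1, 2, 3, 26]    # x + 2y + 3z = 26  (third equation)
second = [2, 3, 1, 34]   # 2x + 3y + z = 34  (second equation)
third = [3, 2, 1, 39]    # 3x + 2y + z = 39  (first equation)


def combine(m, u, n, v):
    """Return m*u - n*v, entry by entry."""
    return [m * p - n * q for p, q in zip(u, v)]


second = combine(3, second, 2, third)   # step (1): 5y + z = 24
first = combine(3, first, 1, third)     # step (2): 4y + 8z = 39
first = combine(5, first, 4, second)    # final step: 36z = 99

# Back substitution on the triangular system.
z = Fraction(first[3], first[2])
y = (second[3] - second[2] * z) / second[1]
x = (third[3] - third[1] * y - third[2] * z) / third[0]
print(x, y, z)
```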


EXAMPLE 4 Greece (third century B.C.)


Archimedes c. 287—212 B.C. 


Perhaps the most famous system of linear equations from antiquity is the one associated with 
the first part of Archimedes' celebrated Cattle Problem. This problem supposedly was posed by 
Archimedes as a challenge to his colleague Eratosthenes. No solution has come down to us 
from ancient times, so that it is not known how, or even whether, either of these two geometers 
solved it. 


If thou art diligent and wise, O stranger, compute the number of cattle of the Sun, who 
once upon a time grazed on the fields of the Thrinacian isle of Sicily, divided into four 
herds of different colors, one milk white, another glossy black, a third yellow, and the 
last dappled. In each herd were bulls, mighty in number according to these proportions: 
Understand, stranger, that the white bulls were equal to a half and a third of the black 
together with the whole of the yellow, while the black were equal to the fourth part of 
the dappled and a fifth, together with, once more, the whole of the yellow. Observe 
further that the remaining bulls, the dappled, were equal to a sixth part of the white and 
a seventh, together with all of the yellow. These were the proportions of the cows: The 
white were precisely equal to the third part and a fourth of the whole herd of the black; 
while the black were equal to the fourth part once more of the dappled and with it a
fifth part, when all, including the bulls, went to pasture together. Now the dappled in
four parts were equal in number to a fifth part and a sixth of the yellow herd. Finally 
the yellow were in number equal to a sixth part and a seventh of the white herd. If thou 
canst accurately tell, O stranger, the number of cattle of the Sun, giving separately the 
number of well-fed bulls and again the number of females according to each color, thou 
wouldst not be called unskilled or ignorant of numbers, but not yet shalt thou be 
numbered among the wise. 


The conventional designation of the eight variables in this problem is


W = number of white bulls
B = number of black bulls
Y = number of yellow bulls
D = number of dappled bulls
w = number of white cows
b = number of black cows
y = number of yellow cows
d = number of dappled cows


The problem can now be stated as the following seven homogeneous equations in eight unknowns: 


1. $W = \left(\tfrac{1}{2} + \tfrac{1}{3}\right)B + Y$ 
(The white bulls were equal to a half and a third of the black [bulls] together with the whole of the 
yellow [bulls].) 


2. $B = \left(\tfrac{1}{4} + \tfrac{1}{5}\right)D + Y$ 
(The black [bulls] were equal to the fourth part of the dappled [bulls] and a fifth, together with, 
once more, the whole of the yellow [bulls].) 


3. $D = \left(\tfrac{1}{6} + \tfrac{1}{7}\right)W + Y$ 
(The remaining bulls, the dappled, were equal to a sixth part of the white [bulls] and a seventh, 
together with all of the yellow [bulls].) 


4. $w = \left(\tfrac{1}{3} + \tfrac{1}{4}\right)(B + b)$ 
(The white [cows] were precisely equal to the third part and a fourth of the whole herd of the black.) 


5. $b = \left(\tfrac{1}{4} + \tfrac{1}{5}\right)(D + d)$ 
(The black [cows] were equal to the fourth part once more of the dappled and with it a fifth part, 
when all, including the bulls, went to pasture together.) 


6. $d = \left(\tfrac{1}{5} + \tfrac{1}{6}\right)(Y + y)$ 
(The dappled [cows] in four parts [that is, in totality] were equal in number to a fifth part and a 
sixth of the yellow herd.) 


7. $y = \left(\tfrac{1}{6} + \tfrac{1}{7}\right)(W + w)$ 
(The yellow [cows] were in number equal to a sixth part and a seventh of the white herd.) 

As we ask you to show in the exercises, this system has infinitely many solutions of the form 


\[
\begin{aligned}
W &= 10{,}366{,}482k \\
B &= 7{,}460{,}514k \\
Y &= 4{,}149{,}387k \\
D &= 7{,}358{,}060k \\
w &= 7{,}206{,}360k \\
b &= 4{,}893{,}246k \\
y &= 5{,}439{,}213k \\
d &= 3{,}515{,}820k
\end{aligned}
\tag{4}
\]


where k is any real number. The values k = 1, 2, ... give infinitely many positive integer 
solutions to the problem, with k = 1 giving the smallest solution. 
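As a quick check on the quoted solution, the following sketch substitutes the k = 1 values into the seven 
conditions using exact rational arithmetic. The function name `residuals` and the variable names are 
illustrative choices, not part of the text; the sketch only verifies the stated values and does not 
derive them. 

```python
from fractions import Fraction as F

# Values for k = 1, copied from the solution quoted above.
W, B, Y, D = 10_366_482, 7_460_514, 4_149_387, 7_358_060   # bulls
w, b, y, d = 7_206_360, 4_893_246, 5_439_213, 3_515_820    # cows


def residuals(W, B, Y, D, w, b, y, d):
    """Left side minus right side for each of the seven conditions."""
    return [
        W - ((F(1, 2) + F(1, 3)) * B + Y),
        B - ((F(1, 4) + F(1, 5)) * D + Y),
        D - ((F(1, 6) + F(1, 7)) * W + Y),
        w - (F(1, 3) + F(1, 4)) * (B + b),
        b - (F(1, 4) + F(1, 5)) * (D + d),
        d - (F(1, 5) + F(1, 6)) * (Y + y),
        y - (F(1, 6) + F(1, 7)) * (W + w),
    ]


base = (W, B, Y, D, w, b, y, d)
all_zero = all(r == 0 for r in residuals(*base))

# The system is homogeneous, so any multiple k of the base solution also works.
scaled_ok = all(r == 0 for r in residuals(*(3 * v for v in base)))

total = sum(base)  # total number of cattle for k = 1
print(all_zero, scaled_ok, total)
```

Because the equations are homogeneous, checking one multiple (here k = 3) illustrates why every k gives a 
solution. 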
EXAMPLE 5 India (fourth century A.D.) 


Fragment III-5-3v of the Bakhshali Manuscript 


The Bakhshali Manuscript is an ancient work of Indian/Hindu mathematics dating from around 
the fourth century A.D., although some of its materials undoubtedly come from many centuries 
before. It consists of about 70 leaves or sheets of birch bark containing mathematical problems 
and their solutions. Many of its problems are so-called equalization problems that lead to 
systems of linear equations. One such problem on the fragment shown is the following: 


One merchant has seven asava horses, a second has nine haya horses, and a third has 
ten camels. They are equally well off in the value of their animals if each gives two 
animals, one to each of the others. Find the price of each animal and the total value of 
the animals possessed by each merchant. 


Let x be the price of an asava horse, let y be the price of a haya horse, let z be the price of a 
camel, and let K be the total value of the animals possessed by each merchant. Then the 
conditions of the problem lead to the following system of equations: 


\[
\begin{aligned}
5x + y + z &= K \\
x + 7y + z &= K \\
x + y + 8z &= K
\end{aligned}
\tag{5}
\]


The method of solution described in the manuscript begins by subtracting the quantity 
$(x + y + z)$ from both sides of the three equations to obtain $4x = 6y = 7z = K - (x + y + z)$. 
This shows that if the prices x, y, and z are to be integers, then the quantity $K - (x + y + z)$ 
must be an integer that is divisible by 4, 6, and 7. The manuscript takes the product of these 
three numbers, or 168, for the value of $K - (x + y + z)$, which yields x = 42, y = 28, and 
z = 24 for the prices and K = 262 for the total value. (See Exercise 6 for more solutions to this 
problem.) 
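The manuscript's procedure can be sketched in a few lines. The helper names below 
(`equalization_solution`, `satisfies_system`) are hypothetical, and using the least common multiple 
instead of the product is an assumption added here to show a smaller admissible choice; the manuscript 
itself uses the product 168. 

```python
from functools import reduce
from math import gcd


def lcm(*values):
    """Least common multiple of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values)


def equalization_solution(c):
    """Given 4x = 6y = 7z = c, return (x, y, z, K) with K = c + x + y + z."""
    x, y, z = c // 4, c // 6, c // 7
    return x, y, z, c + x + y + z


def satisfies_system(x, y, z, K):
    """Check the three equations of system (5)."""
    return 5 * x + y + z == K and x + 7 * y + z == K and x + y + 8 * z == K


manuscript = equalization_solution(4 * 6 * 7)    # the manuscript's choice, c = 168
smallest = equalization_solution(lcm(4, 6, 7))   # c = 84, the smallest common multiple
print(manuscript, smallest)
```

Any common multiple of 4, 6, and 7 can be passed as `c`; each choice gives another integer solution of 
the same system. 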


Exercise Set 10.3 


1. The following lines from Book 12 of Homer's Odyssey relate a precursor of Archimedes' Cattle Problem: 


Thou shalt ascend the isle triangular, 
Where many oxen of the Sun are fed, 
And fatted flocks. Of oxen fifty head 
In every herd feed, and their herds are seven; 


And of his fat flocks is their number even. 


The last line means that there are as many sheep in all the flocks as there are oxen in all the herds. What is 
the total number of oxen and sheep that belong to the god of the Sun? (This was a difficult problem in 
Homer's day.) 


Answer: 


700 
2. Solve the following problems from the Bakhshali Manuscript. 


(a) B possesses two times as much as A; C has three times as much as A and B together; D has four times 
as much as A, B, and C together. Their total possessions are 300. What is the possession of A? 


(b) B gives 2 times as much as A; C gives 3 times as much as B; D gives 4 times as much as C. Their total 
gift is 132. What is the gift of A? 


Answer: 


(a) 5 
(b) 4 
3. A problem on a Babylonian tablet requires finding the length and width of a rectangle given that the length 


and the width add up to 10, while the length and one-fourth of the width add up to 7. The solution 
provided on the tablet consists of the following four statements: 


Multiply 7 by 4 to obtain 28. 
Take away 10 from 28 to obtain 18. 
Take one-third of 18 to obtain 6, the length. 


Take away 6 from 10 to obtain 4, the width. 


Explain how these steps lead to the answer. 


4. The following two problems are from “The Nine Chapters of the Mathematical Art.” Solve them using the 
array technique described in Example 3. 


(a) Five oxen and two sheep are worth 10 units and two oxen and five sheep are worth 8 units. What is the 
value of each ox and sheep? 


(b) There are three kinds of corn. The grains contained in two, three, and four bundles, respectively, of 
these three classes of corn, are not sufficient to make a whole measure. However, if we added to them 
one bundle of the second, third, and first classes, respectively, then the grains would become one full 
measure in each case. How many measures of grain does each bundle of the different classes contain? 


Answer: 


(a) Ox, 34/21 units; sheep, 20/21 unit 


(b) First kind, 9/25 measure; second kind, 7/25 measure; third kind, 4/25 measure 


5. The problem in part (a) is known as the “Flower of Thymaridas,” named after a Pythagorean of the fourth 
century B.C. 


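(a) Given the n numbers $a_1, a_2, \ldots, a_n$, solve for $x_1, x_2, \ldots, x_n$ in the following linear system: 


\[
\begin{aligned}
x_1 + x_2 + \cdots + x_n &= a_1 \\
x_1 + x_2 &= a_2 \\
x_1 + x_3 &= a_3 \\
&\ \,\vdots \\
x_1 + x_n &= a_n
\end{aligned}
\]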


(b) Identify a problem in this exercise set that fits the pattern in part (a), and solve it using your general 
solution. 


Answer: 


(a) $x_1 = \dfrac{a_2 + a_3 + \cdots + a_n - a_1}{n - 2}$, $\quad x_i = a_i - x_1$, $\ i = 2, 3, \ldots, n$ 


(b) Exercise 7(b); gold, 30½ minae; brass, 9½ minae; tin, 14½ minae; iron, 5½ minae 


6. For Example 5 from the Bakhshali Manuscript: 


(a) Express Equations 5 as a homogeneous linear system of three equations in four unknowns (x, y, z, and 
K) and show that the solution set has one arbitrary parameter. 


(b) Find the smallest solution for which all four variables are positive integers. 


(c) Show that the solution given in Example 5 is included among your solutions. 


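Answer: 
(a) 
\[
\begin{aligned}
5x + y + z - K &= 0 \\
x + 7y + z - K &= 0 \\
x + y + 8z - K &= 0
\end{aligned}
\]
$x = \tfrac{21}{131}t$, $y = \tfrac{14}{131}t$, $z = \tfrac{12}{131}t$, $K = t$, where t is an arbitrary number 
(b) Take t = 131, so that x = 21, y = 14, z = 12, K = 131. 
(c) Take t = 262, so that x = 42, y = 28, z = 24, K = 262. 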


7. Solve the problems posed in the following three epigrams, which appear in a collection entitled “The 
Greek Anthology,” compiled in part by a scholar named Metrodorus around A.D. 500. Some of its 46 
mathematical problems are believed to date as far back as 600 B.C. [Note: Before solving parts (a) and (c), 
you will have to formulate the question.] 


(a) I desire my two sons to receive the thousand staters of which I am possessed, but let the fifth part of 
the legitimate one's share exceed by ten the fourth part of what falls to the illegitimate one. 


(b) Make me a crown weighing sixty minae, mixing gold and brass, and with them tin and much-wrought 
iron. Let the gold and brass together form two-thirds, the gold and tin together three-fourths, and the 
gold and iron three-fifths. Tell me how much gold you must put in, how much brass, how much tin, 
and how much iron, so as to make the whole crown weigh sixty minae. 


(c) First person: I have what the second has and the third of what the third has. Second person: I have 
what the third has and the third of what the first has. Third person: And I have ten minae and the third 
of what the second has. 


Answer: 


(a) Legitimate son, 577 7/9 staters; illegitimate son, 422 2/9 staters 


(b) Gold, 30½ minae; brass, 9½ minae; tin, 14½ minae; iron, 5½ minae 


(c) First person, 45; second person, 37½; third person, 22½ 


Section 10.3 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the 
relevant documentation for the particular utility you are using. The goal of these exercises is to provide you 
with a basic proficiency with your technology utility. Once you have mastered the techniques in these 
exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


T1. 

(a) Solve Archimedes' Cattle Problem using a symbolic algebra program. 

(b) The Cattle Problem has a second part in which two additional conditions are imposed. The first of these 
states that “When the white bulls mingled their number with the black, they stood firm, equal in depth and 
breadth.” This requires that W + B be a square number, that is, 1, 4, 9, 16, 25, and so on. Show that this 
requires that the values of k in Eq. 4 be restricted as follows: 

\[ k = 4{,}456{,}749\,r^2, \qquad r = 1, 2, 3, \ldots \]


and find the smallest total number of cattle that satisfies this second condition. 


Remark The second condition imposed in the second part of the Cattle Problem states that “When the 
yellow and the dappled bulls were gathered into one herd, they stood in such a manner that their number, 
beginning from one, grew slowly greater ’til it completed a triangular figure.” This requires that the quantity 
Y + D be a triangular number, that is, a number of the form 1, 1 + 2, 1 + 2 + 3, 1 + 2 + 3 + 4, .... This 
final part of the problem was not completely solved until 1965 when all 206,545 digits of the smallest 
number of cattle that satisfies this condition were found using a computer. 


T2. The following problem is from “The Nine Chapters of the Mathematical Art" and determines a 
homogeneous linear system of five equations in six unknowns. Show that the system has infinitely many 
solutions, and find the one for which the depth of the well and the lengths of the five ropes are the smallest 
possible positive integers. 


Suppose that five families share a well. Suppose further that 

2 of A's ropes are short of the well's depth by one of B's ropes. 
3 of B's ropes are short of the well's depth by one of C's ropes. 
4 of C's ropes are short of the well's depth by one of D's ropes. 
5 of D's ropes are short of the well's depth by one of E's ropes. 
6 of E's ropes are short of the well's depth by one of A's ropes. 
10.4 Cubic Spline Interpolation 


In this section an artist's drafting aid is used as a physical model for the mathematical problem of finding a curve that passes 
through specified points in the plane. The parameters of the curve are determined by solving a linear system of equations. 


Prerequisites 


Linear Systems 
Matrix Algebra 


Differential Calculus 


Curve Fitting 


Fitting a curve through specified points in the plane is a common problem encountered in analyzing experimental data, in 
ascertaining the relations among variables, and in design work. A ubiquitous application is in the design and description of 
computer and printer fonts, such as PostScript™ and TrueType™ fonts (Figure 10.4.1). In Figure 10.4.2 seven points in the 
xy-plane are displayed, and in Figure 10.4.4 a smooth curve has been drawn that passes through them. A curve that passes 
through a set of points in the plane is said to interpolate those points, and the curve is called an interpolating curve for those 
points. The interpolating curve in Figure 10.4.4 was drawn with the aid of a drafting spline (Figure 10.4.3). This drafting aid 
consists of a thin, flexible strip of wood or other material that is bent to pass through the points to be interpolated. Attached 
sliding weights hold the spline in position while the artist draws the interpolating curve. The drafting spline will serve as the 
physical model for a mathematical theory of interpolation that we will discuss in this section. 


Figure 10.4.1 


Figure 10.4.2 


Figure 10.4.3 


Figure 10.4.4 


Statement of the Problem 


Suppose that we are given n points in the xy-plane, 
\[ (x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n) \]
which we wish to interpolate with a “well-behaved” curve (Figure 10.4.5). For convenience, we take the points to be equally 
spaced in the x-direction, although our results can easily be extended to the case of unequally spaced points. If we let the 
common distance between the x-coordinates of the points be h, then we have 
\[ x_2 - x_1 = x_3 - x_2 = \cdots = x_n - x_{n-1} = h \]
Let $y = S(x)$, $x_1 \le x \le x_n$, denote the interpolating curve that we seek. We assume that this curve describes the displacement of 


a drafting spline that interpolates the n points when the weights holding down the spline are situated precisely at the n points. It 
is known from linear beam theory that for small displacements, the fourth derivative of the displacement of a beam is zero along 
any interval of the x-axis that contains no external forces acting on the beam. If we treat our drafting spline as a thin beam and 
realize that the only external forces acting on it arise from the weights at the n specified points, then it follows that 


\[ S^{(4)}(x) = 0 \tag{1} \]


for values of x lying in the n − 1 open intervals 
\[ (x_1, x_2),\ (x_2, x_3),\ \ldots,\ (x_{n-1}, x_n) \]
between the n points. 


Figure 10.4.5 


We also need the result from linear beam theory that states that for a beam acted upon only by external forces, the displacement 
must have two continuous derivatives. In the case of the interpolating curve $y = S(x)$ constructed by the drafting spline, this 
means that $S(x)$, $S'(x)$, and $S''(x)$ must be continuous for $x_1 \le x \le x_n$. 


The condition that $S''(x)$ be continuous is what causes a drafting spline to produce a pleasing curve, as it results in continuous 
curvature. The eye can perceive sudden changes in curvature (that is, discontinuities in $S''(x)$), but sudden changes in higher 
derivatives are not discernible. Thus, the condition that $S''(x)$ be continuous is the minimal prerequisite for the interpolating 
curve to be perceptible as a single smooth curve, rather than as a series of separate curves pieced together. 


To determine the mathematical form of the function S(x), we observe that because $S^{(4)}(x) = 0$ in the intervals between the n 
specified points, it follows by integrating this equation four times that S(x) must be a cubic polynomial in x in each such 
interval. In general, however, S(x) will be a different cubic polynomial in each interval, so S(x) must have the form 


\[
S(x) =
\begin{cases}
S_1(x), & x_1 \le x \le x_2 \\
S_2(x), & x_2 \le x \le x_3 \\
\ \ \vdots \\
S_{n-1}(x), & x_{n-1} \le x \le x_n
\end{cases}
\tag{2}
\]
where $S_1(x), S_2(x), \ldots, S_{n-1}(x)$ are cubic polynomials. For convenience, we will write these in the form 
\[
\begin{aligned}
S_1(x) &= a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
S_2(x) &= a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S_{n-1}(x) &= a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned}
\tag{3}
\]


The $a_i$'s, $b_i$'s, $c_i$'s, and $d_i$'s constitute a total of 4n − 4 coefficients that we must determine to specify S(x) completely. If we 
choose these coefficients so that $S(x)$ interpolates the n specified points in the plane and $S(x)$, $S'(x)$, and $S''(x)$ are 
continuous, then the resulting interpolating curve is called a cubic spline. 


Derivation of the Formula of a Cubic Spline 


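From Equations 2 and 3, we have 


\[
\begin{aligned}
S(x) = S_1(x) &= a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
S(x) = S_2(x) &= a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S(x) = S_{n-1}(x) &= a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned}
\tag{4}
\]
so 
\[
\begin{aligned}
S'(x) = S_1'(x) &= 3a_1(x - x_1)^2 + 2b_1(x - x_1) + c_1, & x_1 \le x \le x_2 \\
S'(x) = S_2'(x) &= 3a_2(x - x_2)^2 + 2b_2(x - x_2) + c_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S'(x) = S_{n-1}'(x) &= 3a_{n-1}(x - x_{n-1})^2 + 2b_{n-1}(x - x_{n-1}) + c_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned}
\tag{5}
\]
and 
\[
\begin{aligned}
S''(x) = S_1''(x) &= 6a_1(x - x_1) + 2b_1, & x_1 \le x \le x_2 \\
S''(x) = S_2''(x) &= 6a_2(x - x_2) + 2b_2, & x_2 \le x \le x_3 \\
&\ \ \vdots \\
S''(x) = S_{n-1}''(x) &= 6a_{n-1}(x - x_{n-1}) + 2b_{n-1}, & x_{n-1} \le x \le x_n
\end{aligned}
\tag{6}
\]


We will now use these equations and the four properties of cubic splines stated below to express the unknown coefficients 
$a_i, b_i, c_i, d_i$, $i = 1, 2, \ldots, n - 1$, in terms of the known coordinates $y_1, y_2, \ldots, y_n$. 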


1. S(x) interpolates the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$. 


Because S(x) interpolates the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, we have 
\[ S(x_1) = y_1,\ S(x_2) = y_2,\ \ldots,\ S(x_n) = y_n \tag{7} \]
From the first n − 1 of these equations and 4, we obtain 
\[
\begin{aligned}
d_1 &= y_1 \\
d_2 &= y_2 \\
&\ \ \vdots \\
d_{n-1} &= y_{n-1}
\end{aligned}
\tag{8}
\]
From the last equation in 7, the last equation in 4, and the fact that $x_n - x_{n-1} = h$, we obtain 
\[ a_{n-1}h^3 + b_{n-1}h^2 + c_{n-1}h + d_{n-1} = y_n \tag{9} \]

2. S(x) is continuous on $[x_1, x_n]$. 

Because S(x) is continuous for $x_1 \le x \le x_n$, it follows that at each point $x_i$ in the set $x_2, x_3, \ldots, x_{n-1}$ we must have 
\[ S_{i-1}(x_i) = S_i(x_i), \qquad i = 2, 3, \ldots, n - 1 \tag{10} \]
Otherwise, the graphs of $S_{i-1}(x)$ and $S_i(x)$ would not join together to form a continuous curve at $x_i$. When we apply the 
interpolating property $S_i(x_i) = y_i$, it follows from 10 that $S_{i-1}(x_i) = y_i$, $i = 2, 3, \ldots, n - 1$, or from 4 that 
\[
\begin{aligned}
a_1h^3 + b_1h^2 + c_1h + d_1 &= y_2 \\
a_2h^3 + b_2h^2 + c_2h + d_2 &= y_3 \\
&\ \ \vdots \\
a_{n-2}h^3 + b_{n-2}h^2 + c_{n-2}h + d_{n-2} &= y_{n-1}
\end{aligned}
\tag{11}
\]

3. $S'(x)$ is continuous on $[x_1, x_n]$. 

Because $S'(x)$ is continuous for $x_1 \le x \le x_n$, it follows that 
\[ S_{i-1}'(x_i) = S_i'(x_i), \qquad i = 2, 3, \ldots, n - 1 \]
or, from 5, 
\[
\begin{aligned}
3a_1h^2 + 2b_1h + c_1 &= c_2 \\
3a_2h^2 + 2b_2h + c_2 &= c_3 \\
&\ \ \vdots \\
3a_{n-2}h^2 + 2b_{n-2}h + c_{n-2} &= c_{n-1}
\end{aligned}
\tag{12}
\]

4. $S''(x)$ is continuous on $[x_1, x_n]$. 

Because $S''(x)$ is continuous for $x_1 \le x \le x_n$, it follows that 
\[ S_{i-1}''(x_i) = S_i''(x_i), \qquad i = 2, 3, \ldots, n - 1 \]
or, from 6, 
\[
\begin{aligned}
6a_1h + 2b_1 &= 2b_2 \\
6a_2h + 2b_2 &= 2b_3 \\
&\ \ \vdots \\
6a_{n-2}h + 2b_{n-2} &= 2b_{n-1}
\end{aligned}
\tag{13}
\]


Equations 8, 9, 11, 12, and 13 constitute a system of 4n − 6 linear equations in the 4n − 4 unknown coefficients $a_i, b_i, c_i, d_i$, 
$i = 1, 2, \ldots, n - 1$. Consequently, we need two more equations to determine these coefficients uniquely. Before obtaining these 
additional equations, however, we can simplify our existing system by expressing the unknowns $a_i$, $b_i$, $c_i$, and $d_i$ in terms of 
new unknown quantities 
\[ M_1 = S''(x_1),\ M_2 = S''(x_2),\ \ldots,\ M_n = S''(x_n) \]
and the known quantities 
\[ y_1,\ y_2,\ \ldots,\ y_n \]
For example, from 6 it follows that 
\[ M_1 = 2b_1,\quad M_2 = 2b_2,\quad \ldots,\quad M_{n-1} = 2b_{n-1} \]
so 
\[ b_1 = \tfrac{1}{2}M_1,\quad b_2 = \tfrac{1}{2}M_2,\quad \ldots,\quad b_{n-1} = \tfrac{1}{2}M_{n-1} \]
Moreover, we already know from 8 that 
\[ d_1 = y_1,\quad d_2 = y_2,\quad \ldots,\quad d_{n-1} = y_{n-1} \]
We leave it as an exercise for you to derive the expressions for the $a_i$'s and $c_i$'s in terms of the $M_i$'s and $y_i$'s. The final result is 
as follows: 


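THEOREM 10.4.1 Cubic Spline Interpolation 


Given n points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_{i+1} - x_i = h$, $i = 1, 2, \ldots, n - 1$, the cubic spline 
\[
S(x) =
\begin{cases}
a_1(x - x_1)^3 + b_1(x - x_1)^2 + c_1(x - x_1) + d_1, & x_1 \le x \le x_2 \\
a_2(x - x_2)^3 + b_2(x - x_2)^2 + c_2(x - x_2) + d_2, & x_2 \le x \le x_3 \\
\ \ \vdots \\
a_{n-1}(x - x_{n-1})^3 + b_{n-1}(x - x_{n-1})^2 + c_{n-1}(x - x_{n-1}) + d_{n-1}, & x_{n-1} \le x \le x_n
\end{cases}
\]
that interpolates these points has coefficients given by 
\[
\begin{aligned}
a_i &= (M_{i+1} - M_i)/6h \\
b_i &= M_i/2 \\
c_i &= (y_{i+1} - y_i)/h - [(M_{i+1} + 2M_i)h/6] \\
d_i &= y_i
\end{aligned}
\tag{14}
\]
for $i = 1, 2, \ldots, n - 1$, where $M_i = S''(x_i)$, $i = 1, 2, \ldots, n$. 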
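The formulas in Equations 14 translate directly into code. The sketch below is illustrative only: the 
function name `spline_coefficients` and the sample values of y and M are hypothetical, chosen to exercise 
the formulas. Because the M values are arbitrary, the pieces are only checked for the interpolation 
property and for the prescribed second derivatives; continuity of the first derivative (Equations 12) is 
not imposed here. 

```python
from fractions import Fraction


def spline_coefficients(y, M, h):
    """Return [(a_i, b_i, c_i, d_i)] for i = 1..n-1, following Equations 14."""
    coeffs = []
    for i in range(len(y) - 1):
        a = (M[i + 1] - M[i]) / (6 * h)
        b = M[i] / 2
        c = (y[i + 1] - y[i]) / h - (M[i + 1] + 2 * M[i]) * h / 6
        d = y[i]
        coeffs.append((a, b, c, d))
    return coeffs


# Hypothetical data (not from the text); exact fractions avoid rounding error.
h = Fraction(1, 2)
y = [Fraction(v) for v in (0, 1, 0, -1)]
M = [Fraction(v) for v in (0, -12, 3, 5)]
coeffs = spline_coefficients(y, M, h)

# Value and second derivative of each piece at the right end of its interval.
end_values = [a * h**3 + b * h**2 + c * h + d for a, b, c, d in coeffs]
end_second = [6 * a * h + 2 * b for a, b, c, d in coeffs]
print(end_values == y[1:], end_second == M[1:])
```

Each piece starts at $y_i$ with second derivative $M_i$ by construction, and the check confirms that it 
also arrives at $y_{i+1}$ with second derivative $M_{i+1}$. 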


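From this result, we see that the quantities $M_1, M_2, \ldots, M_n$ uniquely determine the cubic spline. To find these quantities, we 
substitute the expressions for $a_i$, $b_i$, and $c_i$ given in 14 into 12. After some algebraic simplification, we obtain 


\[
\begin{aligned}
M_1 + 4M_2 + M_3 &= 6(y_1 - 2y_2 + y_3)/h^2 \\
M_2 + 4M_3 + M_4 &= 6(y_2 - 2y_3 + y_4)/h^2 \\
&\ \ \vdots \\
M_{n-2} + 4M_{n-1} + M_n &= 6(y_{n-2} - 2y_{n-1} + y_n)/h^2
\end{aligned}
\tag{15}
\]


or, in matrix form, 


\[
\begin{bmatrix}
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 4 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 4 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 4 & 1
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \\ M_n \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
y_3 - 2y_4 + y_5 \\
\vdots \\
y_{n-4} - 2y_{n-3} + y_{n-2} \\
y_{n-3} - 2y_{n-2} + y_{n-1} \\
y_{n-2} - 2y_{n-1} + y_n
\end{bmatrix}
\]


This is a linear system of n − 2 equations for the n unknowns $M_1, M_2, \ldots, M_n$. Thus, we still need two additional equations to 
determine $M_1, M_2, \ldots, M_n$ uniquely. The reason for this is that there are infinitely many cubic splines that interpolate the 
given points, so we simply do not have enough conditions to determine a unique cubic spline passing through the points. We 
discuss below three possible ways of specifying the two additional conditions required to obtain a unique cubic spline through 
the points. (The exercises present two more.) They are summarized in Table 1. 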


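Table 1 

Natural Spline 
Description: The second derivative of the spline is zero at the endpoints. 
End conditions: $M_1 = 0$, $M_n = 0$ 
Linear system: 
\[
\begin{bmatrix}
4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & 4
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix}
\]

Parabolic Runout Spline 
Description: The spline reduces to a parabolic curve on the first and last intervals. 
End conditions: $M_1 = M_2$, $M_n = M_{n-1}$ 
Linear system: 
\[
\begin{bmatrix}
5 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & 5
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix}
\]

Cubic Runout Spline 
Description: The spline is a single cubic curve on the first two and last two intervals. 
End conditions: $M_1 = 2M_2 - M_3$, $M_n = 2M_{n-1} - M_{n-2}$ 
Linear system: 
\[
\begin{bmatrix}
6 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0 & 6
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix} y_1 - 2y_2 + y_3 \\ y_2 - 2y_3 + y_4 \\ \vdots \\ y_{n-3} - 2y_{n-2} + y_{n-1} \\ y_{n-2} - 2y_{n-1} + y_n \end{bmatrix}
\]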
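The algebraic simplification that leads from Equations 12 and 14 to Equations 15 is not written out in 
the text. One way to carry it out, for a generic index $i = 1, 2, \ldots, n - 2$, is sketched below; this is 
a reconstruction of the omitted step, not a quotation of the authors' derivation. 

```latex
% Row i of Equations 12 reads 3 a_i h^2 + 2 b_i h + c_i = c_{i+1}.
% Substitute a_i, b_i, c_i from Equations 14 into the left side:
\begin{align*}
3a_i h^2 + 2b_i h + c_i
  &= \frac{(M_{i+1} - M_i)h}{2} + M_i h + \frac{y_{i+1} - y_i}{h} - \frac{(M_{i+1} + 2M_i)h}{6} \\
  &= \frac{y_{i+1} - y_i}{h} + \frac{(M_i + 2M_{i+1})h}{6},
\end{align*}
% and c_{i+1} from Equations 14 into the right side:
\[
c_{i+1} = \frac{y_{i+2} - y_{i+1}}{h} - \frac{(M_{i+2} + 2M_{i+1})h}{6}.
\]
% Equating the two sides and multiplying through by 6/h gives row i of Equations 15:
\[
M_i + 4M_{i+1} + M_{i+2} = \frac{6\,(y_i - 2y_{i+1} + y_{i+2})}{h^2}.
\]
```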


The Natural Spline 


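The two simplest mathematical conditions we can impose are 
\[ M_1 = M_n = 0 \]
These conditions together with 15 result in an n × n linear system for $M_1, M_2, \ldots, M_n$, which can be written in matrix form as 

\[
\begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ \vdots \\ M_{n-1} \\ M_n \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
0 \\
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
\vdots \\
y_{n-2} - 2y_{n-1} + y_n \\
0
\end{bmatrix}
\]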

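For numerical calculations it is more convenient to eliminate $M_1$ and $M_n$ from this system and write 


\[
\begin{bmatrix}
4 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 4
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
y_3 - 2y_4 + y_5 \\
\vdots \\
y_{n-3} - 2y_{n-2} + y_{n-1} \\
y_{n-2} - 2y_{n-1} + y_n
\end{bmatrix}
\]
together with 
\[ M_1 = 0 \tag{17} \]
\[ M_n = 0 \tag{18} \]


Thus, the (n − 2) × (n − 2) linear system can be solved for the n − 2 coefficients $M_2, M_3, \ldots, M_{n-1}$, and $M_1$ and $M_n$ are 
determined by 17 and 18. 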
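The text does not prescribe a method for solving the (n − 2) × (n − 2) system. Because its matrix is 
tridiagonal, one common choice is forward elimination followed by back substitution (often called the 
Thomas algorithm). The sketch below assumes equally spaced data; the function name 
`natural_spline_moments` and the sample data are hypothetical. 

```python
from fractions import Fraction


def natural_spline_moments(y, h):
    """Return [M_1, ..., M_n] for the natural spline through equally spaced data."""
    n = len(y)
    m = n - 2                                  # interior unknowns M_2, ..., M_{n-1}
    if m < 1:
        return [0] * n
    rhs = [6 * (y[i] - 2 * y[i + 1] + y[i + 2]) / h**2 for i in range(m)]
    diag = [Fraction(4)] * m                   # main diagonal; off-diagonals are all 1

    # Forward elimination: remove the subdiagonal 1 in each row.
    for i in range(1, m):
        factor = 1 / diag[i - 1]
        diag[i] = diag[i] - factor             # superdiagonal entry of row i-1 is 1
        rhs[i] = rhs[i] - factor * rhs[i - 1]

    # Back substitution.
    interior = [0] * m
    interior[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        interior[i] = (rhs[i] - interior[i + 1]) / diag[i]

    return [0] + interior + [0]                # M_1 = M_n = 0 (Equations 17 and 18)


# Hypothetical data with h = 1, chosen only to illustrate the call.
y = [Fraction(v) for v in (0, 1, 0, 1, 0)]
M = natural_spline_moments(y, Fraction(1))
print(M)
```

Exact fractions are used so the result can be checked by substituting back into the system; floating-point 
inputs work the same way. 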


Physically, the natural spline results when the ends of a drafting spline extend freely beyond the interpolating points without 
constraint. The end portions of the spline outside the interpolating points will fall on straight line paths, causing $S''(x)$ to 
vanish at the endpoints $x_1$ and $x_n$ and resulting in the mathematical conditions $M_1 = M_n = 0$. 


The natural spline tends to flatten the interpolating curve at the endpoints, which may be undesirable. Of course, if it is required 
that $S''(x)$ vanish at the endpoints, then the natural spline must be used. 


The Parabolic Runout Spline 


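The two additional constraints imposed for this type of spline are 


\[ M_1 = M_2 \tag{19} \]


\[ M_n = M_{n-1} \tag{20} \]


If we use the preceding two equations to eliminate $M_1$ and $M_n$ from 15, we obtain the (n − 2) × (n − 2) linear system 


\[
\begin{bmatrix}
5 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 5
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
y_3 - 2y_4 + y_5 \\
\vdots \\
y_{n-3} - 2y_{n-2} + y_{n-1} \\
y_{n-2} - 2y_{n-1} + y_n
\end{bmatrix}
\tag{21}
\]
for $M_2, M_3, \ldots, M_{n-1}$. Once these n − 2 values have been determined, $M_1$ and $M_n$ are determined from 19 and 20. 

From 14 we see that $M_1 = M_2$ implies that $a_1 = 0$, and $M_n = M_{n-1}$ implies that $a_{n-1} = 0$. Thus, from 3 there are no cubic 
terms in the formula for the spline over the end intervals $[x_1, x_2]$ and $[x_{n-1}, x_n]$. Hence, as the name suggests, the parabolic 
runout spline reduces to a parabolic curve over these end intervals. 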


The Cubic Runout Spline 


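For this type of spline, we impose the two additional conditions 


\[ M_1 = 2M_2 - M_3 \tag{22} \]


\[ M_n = 2M_{n-1} - M_{n-2} \tag{23} \]


Using these two equations to eliminate $M_1$ and $M_n$ from 15 results in the following (n − 2) × (n − 2) linear system for 
$M_2, M_3, \ldots, M_{n-1}$: 


\[
\begin{bmatrix}
6 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 6
\end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
y_3 - 2y_4 + y_5 \\
\vdots \\
y_{n-3} - 2y_{n-2} + y_{n-1} \\
y_{n-2} - 2y_{n-1} + y_n
\end{bmatrix}
\tag{24}
\]


After we solve this linear system for $M_2, M_3, \ldots, M_{n-1}$, we can use 22 and 23 to determine $M_1$ and $M_n$. 


If we rewrite 22 as 
\[ M_2 - M_1 = M_3 - M_2 \]
it follows from 14 that $a_1 = a_2$. Because $S'''(x) = 6a_1$ on $[x_1, x_2]$ and $S'''(x) = 6a_2$ on $[x_2, x_3]$, we see that $S'''(x)$ is 
constant over the entire interval $[x_1, x_3]$. Consequently, S(x) consists of a single cubic curve over the interval $[x_1, x_3]$ rather 
than two different cubic curves pieced together at $x_2$. [To see this, integrate $S'''(x)$ three times.] A similar analysis shows that 
S(x) consists of a single cubic curve over the last two intervals. 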
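The step from Equation 22 to a single cubic on $[x_1, x_3]$ can be spelled out as follows. This is a sketch 
added for illustration; it uses only Equations 8 and 11 through 14 and is an alternative to the integration 
argument suggested in the text. 

```latex
% From Equations 14 and the rewritten form of Equation 22:
\[
a_1 = \frac{M_2 - M_1}{6h}, \qquad a_2 = \frac{M_3 - M_2}{6h},
\qquad M_2 - M_1 = M_3 - M_2 \;\Longrightarrow\; a_1 = a_2 .
\]
% Value and derivatives of S_1 at x_2 (where x_2 - x_1 = h), using the first rows of
% Equations 11, 12, and 13 together with d_2 = y_2 from Equations 8:
\begin{align*}
S_1(x_2)    &= a_1 h^3 + b_1 h^2 + c_1 h + d_1 = y_2 = d_2, \\
S_1'(x_2)   &= 3a_1 h^2 + 2b_1 h + c_1 = c_2, \\
S_1''(x_2)  &= 6a_1 h + 2b_1 = 2b_2, \\
S_1'''(x_2) &= 6a_1 = 6a_2 .
\end{align*}
% A cubic is determined by its value and first three derivatives at one point, so
\[
S_1(x) = a_2 (x - x_2)^3 + b_2 (x - x_2)^2 + c_2 (x - x_2) + d_2 = S_2(x).
\]
```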


Whereas the natural spline tends to produce an interpolating curve that is flat at the endpoints, the cubic runout spline has the 
opposite tendency: it produces a curve with pronounced curvature at the endpoints. If neither behavior is desired, the parabolic 
runout spline is a reasonable compromise. 


EXAMPLE 1 Using a Parabolic Runout Spline 


The density of water is well known to reach a maximum at a temperature slightly above freezing. Table 2, from 
the Handbook of Chemistry and Physics (CRC Press, 2009), gives the density of water in grams per cubic 
centimeter for five equally spaced temperatures from −10°C to 30°C. We will interpolate these five 
temperature-density measurements with a parabolic runout spline and attempt to find the maximum density of 
water in this range by finding the maximum value on this cubic spline. In the exercises we ask you to perform 
similar calculations using a natural spline and a cubic runout spline to interpolate the data points. 


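Table 2 


Temperature (°C) | Density (g/cm³) 
-10 | .99815 
0 | .99987 
10 | .99973 
20 | .99823 
30 | .99567 


Set 
\[
\begin{aligned}
x_1 &= -10, & y_1 &= .99815 \\
x_2 &= 0, & y_2 &= .99987 \\
x_3 &= 10, & y_3 &= .99973 \\
x_4 &= 20, & y_4 &= .99823 \\
x_5 &= 30, & y_5 &= .99567
\end{aligned}
\]
Then 
\[
\begin{aligned}
6[y_1 - 2y_2 + y_3]/h^2 &= -.0001116 \\
6[y_2 - 2y_3 + y_4]/h^2 &= -.0000816 \\
6[y_3 - 2y_4 + y_5]/h^2 &= -.0000636
\end{aligned}
\]
and the linear system 21 for the parabolic runout spline becomes 
\[
\begin{bmatrix} 5 & 1 & 0 \\ 1 & 4 & 1 \\ 0 & 1 & 5 \end{bmatrix}
\begin{bmatrix} M_2 \\ M_3 \\ M_4 \end{bmatrix}
=
\begin{bmatrix} -.0001116 \\ -.0000816 \\ -.0000636 \end{bmatrix}
\]
Solving this system yields 
\[
\begin{aligned}
M_2 &= -.00001973 \\
M_3 &= -.00001293 \\
M_4 &= -.00001013
\end{aligned}
\]
From 19 and 20, we have 
\[
\begin{aligned}
M_1 &= M_2 = -.00001973 \\
M_5 &= M_4 = -.00001013
\end{aligned}
\]
Solving for the $a_i$'s, $b_i$'s, $c_i$'s, and $d_i$'s in 14, we obtain the following expression for the interpolating parabolic 
runout spline: 

\[
S(x) =
\begin{cases}
-.00000987(x + 10)^2 + .0002707(x + 10) + .99815, & -10 \le x \le 0 \\
.000000113(x - 0)^3 - .00000987(x - 0)^2 + .0000733(x - 0) + .99987, & 0 \le x \le 10 \\
.000000047(x - 10)^3 - .00000647(x - 10)^2 - .0000900(x - 10) + .99973, & 10 \le x \le 20 \\
-.00000507(x - 20)^2 - .0002053(x - 20) + .99823, & 20 \le x \le 30
\end{cases}
\]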


This spline is plotted in Figure 10.4.6. From that figure we see that the maximum is attained in the interval 
[0, 10]. To find this maximum, we set $S'(x)$ equal to zero in the interval [0, 10]: 
\[ S'(x) = .000000339x^2 - .0000197x + .0000733 = 0 \]


To three significant digits the root of this quadratic in the interval [0, 10] is x = 3.99, and for this value of x, 
S(3.99) = 1.00001. Thus, according to our interpolated estimate, the maximum density of water is 
1.00001 g/cm³ attained at 3.99°C. This agrees well with the experimental maximum density of 
1.00000 g/cm³ attained at 3.98°C. (In the original metric system, the gram was defined as the mass of one cubic 
centimeter of water at its maximum density.) 
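The arithmetic in this example can be reproduced with a short script. The sketch below is illustrative 
only: the variable names are hypothetical, the 3 × 3 system is solved by eliminating $M_2$ and $M_4$ by 
hand (any linear solver would do), and exact fractions are used until the final root-finding step. 

```python
from fractions import Fraction

# Data from Table 2; temperatures are -10, 0, 10, 20, 30 with spacing h = 10.
dens = [Fraction(s) for s in ("0.99815", "0.99987", "0.99973", "0.99823", "0.99567")]
h = 10

# Right-hand sides 6(y_i - 2y_{i+1} + y_{i+2})/h^2 of Equations 15.
r = [6 * (dens[i] - 2 * dens[i + 1] + dens[i + 2]) / h**2 for i in range(3)]

# System 21 for n = 5 has matrix [[5, 1, 0], [1, 4, 1], [0, 1, 5]].
# Rows 1 and 3 give M2 = (r1 - M3)/5 and M4 = (r3 - M3)/5; row 2 then gives M3.
M3 = (5 * r[1] - r[0] - r[2]) / 18
M2 = (r[0] - M3) / 5
M4 = (r[2] - M3) / 5
M = [M2, M2, M3, M4, M4]          # M1 = M2 and M5 = M4 (Equations 19 and 20)

# Coefficients on [0, 10] (second interval) from Equations 14.
a2 = (M[2] - M[1]) / (6 * h)
b2 = M[1] / 2
c2 = (dens[2] - dens[1]) / h - (M[2] + 2 * M[1]) * h / 6
d2 = dens[1]

# S'(x) = 3*a2*x^2 + 2*b2*x + c2 on [0, 10]; the smaller root lies in the interval.
A, B, C = float(3 * a2), float(2 * b2), float(c2)
x_max = (-B - (B * B - 4 * A * C) ** 0.5) / (2 * A)
s_max = float(a2) * x_max**3 + float(b2) * x_max**2 + C * x_max + float(d2)
print(round(x_max, 2), round(s_max, 5))
```

Only the second interval is needed because the maximum lies in [0, 10]; the other pieces follow from the 
same formulas with the corresponding indices. 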
Figure 10.4.6 (horizontal axis: Temperature (°C)) 


Closing Remarks 


In addition to producing excellent interpolating curves, cubic splines and their generalizations are useful for numerical 
integration and differentiation, for the numerical solution of differential and integral equations, and in optimization theory. 


Exercise Set 10.4 


1. Derive the expressions for $a_i$ and $c_i$ in Equations 14 of Theorem 10.4.1. 

2. The six points 
(0, .00000), (.2, .19867), (.4, .38942), 
(.6, .56464), (.8, .71736), (1.0, .84147) 


lie on the graph of y = sin x, where x is in radians. 


(a) Find the portion of the parabolic runout spline that interpolates these six points for .4 ≤ x ≤ .6. Maintain an accuracy of 
five decimal places in your calculations. 


(b) Calculate S(.5) for the spline you found in part (a). What is the percentage error of S(.5) with respect to the “exact” 
value of sin(.5) = .47943? 


Answer: 


(a) $S(x) = -.12643(x - .4)^3 - .20211(x - .4)^2 + .92158(x - .4) + .38942$ 
(b) S(.5) = .47943, error = 0% 
3. The following five points 


(0, 1), (1, 7), (2, 27), (3, 79), (4, 181) 


lie on a single cubic curve. 


(a) Which of the three types of cubic splines (natural, parabolic runout, or cubic runout) would agree exactly with the single 
cubic curve on which the five points lie? 


(b) Determine the cubic spline you chose in part (a), and verify that it is a single cubic curve that interpolates the five points. 


Answer: 


(a) The cubic runout spline 


(b) $S(x) = 3x^3 - 2x^2 + 5x + 1$ 


4. Repeat the calculations in Example 1 using a natural spline to interpolate the five data points. 


Answer: 
\[
S(x) =
\begin{cases}
-.00000042(x + 10)^3 + .000214(x + 10) + .99815, & -10 \le x \le 0 \\
.00000024x^3 - .0000126x^2 + .000088x + .99987, & 0 \le x \le 10 \\
-.00000004(x - 10)^3 - .0000054(x - 10)^2 - .000092(x - 10) + .99973, & 10 \le x \le 20 \\
.00000022(x - 20)^3 - .0000066(x - 20)^2 - .000212(x - 20) + .99823, & 20 \le x \le 30
\end{cases}
\]


Maximum at (x, S(x)) = (3.93, 1.00004) 


5. Repeat the calculations in Example 1 using a cubic runout spline to interpolate the five data points. 


Answer: 
\[
S(x) =
\begin{cases}
.00000009(x + 10)^3 - .0000121(x + 10)^2 + .000282(x + 10) + .99815, & -10 \le x \le 0 \\
.00000009x^3 - .0000093x^2 + .000070x + .99987, & 0 \le x \le 10 \\
.00000004(x - 10)^3 - .0000066(x - 10)^2 - .000087(x - 10) + .99973, & 10 \le x \le 20 \\
.00000004(x - 20)^3 - .0000053(x - 20)^2 - .000207(x - 20) + .99823, & 20 \le x \le 30
\end{cases}
\]


Maximum at (x, S(x)) = (4.00, 1.00001) 

6. Consider the five points (0, 0), (.5, 1), (1, 0), (1.5, −1), and (2, 0) on the graph of $y = \sin(\pi x)$. 
(a) Use a natural spline to interpolate the data points (0, 0), (.5, 1), and (1, 0). 
(b) Use a natural spline to interpolate the data points (.5, 1), (1, 0), and (1.5, −1). 


(c) Explain the unusual nature of your result in part (b). 


Answer: 


(a) 
\[
S(x) =
\begin{cases}
-4x^3 + 3x, & 0 \le x \le 0.5 \\
4x^3 - 12x^2 + 9x - 1, & 0.5 \le x \le 1
\end{cases}
\]


(b) 
\[
S(x) =
\begin{cases}
2 - 2x, & 0.5 \le x \le 1 \\
2 - 2x, & 1 \le x \le 1.5
\end{cases}
\]


(c) The three data points are collinear. 


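7. (The Periodic Spline) If it is known or if it is desired that the n points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ to be interpolated lie 
on a single cycle of a periodic curve with period $x_n - x_1$, then an interpolating cubic spline S(x) must satisfy 
\[
\begin{aligned}
S(x_1) &= S(x_n) \\
S'(x_1) &= S'(x_n) \\
S''(x_1) &= S''(x_n)
\end{aligned}
\]
(a) Show that these three periodicity conditions require that 
\[
\begin{aligned}
y_1 &= y_n \\
M_1 &= M_n \\
4M_1 + M_2 + M_{n-1} &= 6(y_{n-1} - 2y_1 + y_2)/h^2
\end{aligned}
\]


(b) Using the three equations in part (a) and Equations 15, construct an (n − 1) × (n − 1) linear system for 
$M_1, M_2, \ldots, M_{n-1}$ in matrix form. 


Answer: 

(b) 
\[
\begin{bmatrix}
4 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 4 & 1 \\
1 & 0 & 0 & 0 & \cdots & 0 & 0 & 1 & 4
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ \vdots \\ M_{n-2} \\ M_{n-1} \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
y_{n-1} - 2y_1 + y_2 \\
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
\vdots \\
y_{n-3} - 2y_{n-2} + y_{n-1} \\
y_{n-2} - 2y_{n-1} + y_1
\end{bmatrix}
\]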

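8. (The Clamped Spline) Suppose that, in addition to the n points to be interpolated, we are given specific values $y_1'$ and $y_n'$ for 
the slopes $S'(x_1)$ and $S'(x_n)$ of the interpolating cubic spline at the endpoints $x_1$ and $x_n$. 


(a) Show that 
\[
\begin{aligned}
2M_1 + M_2 &= 6(y_2 - y_1 - hy_1')/h^2 \\
2M_n + M_{n-1} &= 6(y_{n-1} - y_n + hy_n')/h^2
\end{aligned}
\]


(b) Using the equations in part (a) and Equations 15, construct an n × n linear system for $M_1, M_2, \ldots, M_n$ in matrix form. 


Remark The clamped spline described in this exercise is the most accurate type of spline for interpolation work if the 
slopes at the endpoints are known or can be estimated. 


Answer: 

(b) 
\[
\begin{bmatrix}
2 & 1 & 0 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 2
\end{bmatrix}
\begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ \vdots \\ M_{n-1} \\ M_n \end{bmatrix}
= \frac{6}{h^2}
\begin{bmatrix}
-hy_1' - y_1 + y_2 \\
y_1 - 2y_2 + y_3 \\
y_2 - 2y_3 + y_4 \\
\vdots \\
y_{n-2} - 2y_{n-1} + y_n \\
y_{n-1} - y_n + hy_n'
\end{bmatrix}
\]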

Section 10.4 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, 
Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some 
linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are 
using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have 
mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the 
regular exercise sets. 


T1. In the solution of the natural cubic spline problem, it is necessary to solve a system of equations having coefficient matrix

\[
A_n = \begin{bmatrix}
4 & 1 & 0 & \cdots & 0 & 0 & 0 \\
1 & 4 & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 4 & 1 \\
0 & 0 & 0 & \cdots & 0 & 1 & 4
\end{bmatrix}
\]

If we can present a formula for the inverse of this matrix, then the solution for the natural cubic spline problem can be easily obtained. In this exercise and the next, we use a computer to discover this formula. Toward this end, we first determine an expression for the determinant of $A_n$, denoted by the symbol $D_n$. Given that

\[
A_1 = [4], \quad A_2 = \begin{bmatrix} 4 & 1 \\ 1 & 4 \end{bmatrix}, \quad \ldots
\]

we see that

\[
D_1 = \det(A_1) = \det[4] = 4
\]

and

\[
D_2 = \det(A_2) = \begin{vmatrix} 4 & 1 \\ 1 & 4 \end{vmatrix} = 15
\]

(a) Use the cofactor expansion of determinants to show that

\[
D_n = 4D_{n-1} - D_{n-2}
\]

for n = 3, 4, 5, .... This says, for example, that

\[
\begin{aligned}
D_3 &= 4D_2 - D_1 = 4(15) - 4 = 56 \\
D_4 &= 4D_3 - D_2 = 4(56) - 15 = 209
\end{aligned}
\]

and so on. Using a computer, check this result for $5 \le n \le 10$.
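(b) By writing

\[
D_n = 4D_{n-1} - D_{n-2}
\]

and the identity, $D_{n-1} = D_{n-1}$, in matrix form,

\[
\begin{bmatrix} D_n \\ D_{n-1} \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} D_{n-1} \\ D_{n-2} \end{bmatrix}
\]

show that

\[
\begin{bmatrix} D_n \\ D_{n-1} \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} \begin{bmatrix} D_2 \\ D_1 \end{bmatrix} = \begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} \begin{bmatrix} 15 \\ 4 \end{bmatrix}
\]

(c) Use the methods in Section 5.2 and a computer to show that

\[
\begin{bmatrix} 4 & -1 \\ 1 & 0 \end{bmatrix}^{n-2} = \frac{1}{2\sqrt{3}}
\begin{bmatrix}
(2+\sqrt{3})^{n-1} - (2-\sqrt{3})^{n-1} & (2-\sqrt{3})^{n-2} - (2+\sqrt{3})^{n-2} \\
(2+\sqrt{3})^{n-2} - (2-\sqrt{3})^{n-2} & (2-\sqrt{3})^{n-3} - (2+\sqrt{3})^{n-3}
\end{bmatrix}
\]

and hence

\[
D_n = \frac{(2+\sqrt{3})^{n+1} - (2-\sqrt{3})^{n+1}}{2\sqrt{3}}
\]

for n = 1, 2, 3, ....


(d) Using a computer, check this result for $1 \le n \le 10$.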
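
One way to carry out the computer checks in parts (a) and (d) is sketched below in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the determinant is computed by exact Gaussian elimination on the tridiagonal matrix with 4 on the diagonal and 1 beside it, and the closed form is evaluated in floating point and then rounded.

```python
from fractions import Fraction
from math import sqrt

def det_by_recurrence(n):
    """D_n from D_n = 4*D_{n-1} - D_{n-2}, starting with D_1 = 4, D_2 = 15."""
    if n == 1:
        return 4
    prev, curr = 4, 15
    for _ in range(n - 2):
        prev, curr = curr, 4 * curr - prev
    return curr

def det_by_closed_form(n):
    """Floating-point value of ((2+sqrt3)^(n+1) - (2-sqrt3)^(n+1)) / (2*sqrt3)."""
    r = sqrt(3)
    return ((2 + r) ** (n + 1) - (2 - r) ** (n + 1)) / (2 * r)

def det_by_elimination(n):
    """Exact determinant of the n x n matrix with 4 on the diagonal, 1 beside it."""
    a = [[Fraction(4 if i == j else 1 if abs(i - j) == 1 else 0)
          for j in range(n)] for i in range(n)]
    det = Fraction(1)
    for c in range(n):
        det *= a[c][c]              # pivots stay nonzero for this matrix
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det

for n in range(1, 11):
    print(n, det_by_recurrence(n), det_by_elimination(n), round(det_by_closed_form(n)))
```

The three columns printed for each n should agree if the recurrence and the closed form are correct.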


T2. In this exercise, we determine a formula for calculating $A_n^{-1}$ from $D_k$ for k = 0, 1, 2, 3, ..., n, assuming that $D_0$ is defined to be 1.


(a) Use a computer to compute $A_k^{-1}$ for k = 1, 2, 3, 4, and 5.
(b) From your results in part (a), discover the conjecture that

\[
A_n^{-1} = [\alpha_{ij}]
\]

where $\alpha_{ij} = \alpha_{ji}$ and

\[
\alpha_{ij} = (-1)^{i+j} \left( \frac{D_{n-j}\, D_{i-1}}{D_n} \right)
\]

for $i \le j$.


(c) Use the result in part (b) to compute $A_7^{-1}$ and compare it to the result obtained using the computer.
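
The conjecture in part (b) can be tested with exact arithmetic. The sketch below is a hedged illustration, not a prescribed solution: it assumes the conjectured entries are (-1)^(i+j) D_(n-j) D_(i-1) / D_n for i <= j with symmetry filling the rest, and the helper names are hypothetical. It multiplies the tridiagonal matrix by the conjectured inverse and checks for the identity.

```python
from fractions import Fraction

def determinants(n):
    """Return [D_0, D_1, ..., D_n] with D_0 = 1, D_1 = 4, D_k = 4*D_{k-1} - D_{k-2}."""
    d = [1, 4]
    for _ in range(2, n + 1):
        d.append(4 * d[-1] - d[-2])
    return d[: n + 1]

def spline_matrix(n):
    """The n x n matrix with 4 on the diagonal and 1 on the adjacent diagonals."""
    return [[Fraction(4 if i == j else 1 if abs(i - j) == 1 else 0)
             for j in range(n)] for i in range(n)]

def conjectured_inverse(n):
    """Entries from the conjectured formula, using 1-based indices i <= j."""
    d = determinants(n)
    inv = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            value = Fraction((-1) ** (i + j) * d[n - j] * d[i - 1], d[n])
            inv[i - 1][j - 1] = value
            inv[j - 1][i - 1] = value   # symmetry
    return inv

def is_identity_product(a, b):
    """True if the matrix product a*b is exactly the identity."""
    n = len(a)
    return all(
        sum(a[i][t] * b[t][j] for t in range(n)) == (1 if i == j else 0)
        for i in range(n) for j in range(n)
    )

print(is_identity_product(spline_matrix(7), conjectured_inverse(7)))
```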


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


10.5 Markov Chains 


In this section we describe a general model of a system that changes from state to state. We then apply the model 
to several concrete problems. 


Prerequisites 


Linear Systems 
Matrices 


Intuitive Understanding of Limits 


A Markov Process 


Suppose a physical or mathematical system undergoes a process of change such that at any moment it can occupy 
one of a finite number of states. For example, the weather in a certain city could be in one of three possible 
states: sunny, cloudy, or rainy. Or an individual could be in one of four possible emotional states: happy, sad, 
angry, or apprehensive. Suppose that such a system changes with time from one state to another and at scheduled 
times the state of the system is observed. If the state of the system at any observation cannot be predicted with
certainty, but the probability that a given state occurs can be predicted by just knowing the state of the system at 
the preceding observation, then the process of change is called a Markov chain or Markov process. 


DEFINITION 1 


If a Markov chain has k possible states, which we label as 1, 2, ..., k, then the probability that the system
is in state i at any observation after it was in state j at the preceding observation is denoted by $p_{ij}$ and is
called the transition probability from state j to state i. The matrix $P = [p_{ij}]$ is called the transition
matrix of the Markov chain.


For example, in a three-state Markov chain, the transition matrix has the form

\[
\begin{bmatrix}
p_{11} & p_{12} & p_{13} \\
p_{21} & p_{22} & p_{23} \\
p_{31} & p_{32} & p_{33}
\end{bmatrix}
\]

(columns: Preceding State 1, 2, 3; rows: New State 1, 2, 3)


In this matrix, $p_{32}$ is the probability that the system will change from state 2 to state 3, $p_{11}$ is the probability that
the system will still be in state 1 if it was previously in state 1, and so forth.


EXAMPLE 1 Transition Matrix of the Markov Chain — 


A car rental agency has three rental locations, denoted by 1, 2, and 3. A customer may rent a car 
from any of the three locations and return the car to any of the three locations. The manager finds 
that customers return the cars to the various locations according to the following probabilities: 


\[
\begin{bmatrix}
.8 & .3 & .2 \\
.1 & .2 & .6 \\
.1 & .5 & .2
\end{bmatrix}
\]

(columns: Rented from Location 1, 2, 3; rows: Returned to Location 1, 2, 3)


This matrix is the transition matrix of the system considered as a Markov chain. From this matrix,
the probability is .6 that a car rented from location 3 will be returned to location 2, the probability
is .8 that a car rented from location 1 will be returned to location 1, and so forth.


EXAMPLE 2 Transition Matrix of the Markov Chain — 


By reviewing its donation records, the alumni office of a college finds that 80% of its alumni who 
contribute to the annual fund one year will also contribute the next year, and 30% of those who do 
not contribute one year will contribute the next. This can be viewed as a Markov chain with two 
states: state 1 corresponds to an alumnus giving a donation in any one year, and state 2 corresponds 
to the alumnus not giving a donation in that year. The transition matrix is 


\[
P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix}
\]


In the examples above, the transition matrices of the Markov chains have the property that the entries in any
column sum to 1. This is not accidental. If $P = [p_{ij}]$ is the transition matrix of any Markov chain with k states,
then for each j we must have

\[
p_{1j} + p_{2j} + \cdots + p_{kj} = 1 \qquad (1)
\]

because if the system is in state j at one observation, it is certain to be in one of the k possible states at the next
observation.


A matrix with property 1 is called a stochastic matrix, a probability matrix, or a Markov matrix. From the 
preceding discussion, it follows that the transition matrix for a Markov chain must be a stochastic matrix. 


In a Markov chain, the state of the system at any observation time cannot generally be determined with certainty. 
The best one can usually do is specify probabilities for each of the possible states. For example, in a Markov 


chain with three states, we might describe the possible state of the system at some observation time by a column
vector

\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\]

in which $x_1$ is the probability that the system is in state 1, $x_2$ the probability that it is in state 2, and $x_3$ the
probability that it is in state 3. In general we make the following definition.


DEFINITION 2 
The state vector for an observation of a Markov chain with k states is a column vector x whose ith
component $x_i$ is the probability that the system is in the ith state at that time.


Observe that the entries in any state vector for a Markov chain are nonnegative and have a sum of 1. (Why?) A 
column vector that has this property is called a probability vector. 


Let us suppose now that we know the state vector $\mathbf{x}^{(0)}$ for a Markov chain at some initial observation. The
following theorem will enable us to determine the state vectors

\[
\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots, \mathbf{x}^{(n)}, \ldots
\]

at the subsequent observation times.


THEOREM 10.5.1 


If P is the transition matrix of a Markov chain and $\mathbf{x}^{(n)}$ is the state vector at the nth observation, then

\[
\mathbf{x}^{(n+1)} = P\mathbf{x}^{(n)}
\]


The proof of this theorem involves ideas from probability theory and will not be given here. From this theorem, 
it follows that 


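\[
\begin{aligned}
\mathbf{x}^{(1)} &= P\mathbf{x}^{(0)} \\
\mathbf{x}^{(2)} &= P\mathbf{x}^{(1)} = P^2\mathbf{x}^{(0)} \\
\mathbf{x}^{(3)} &= P\mathbf{x}^{(2)} = P^3\mathbf{x}^{(0)} \\
&\;\;\vdots \\
\mathbf{x}^{(n)} &= P\mathbf{x}^{(n-1)} = P^n\mathbf{x}^{(0)}
\end{aligned}
\]


In this way, the initial state vector $\mathbf{x}^{(0)}$ and the transition matrix P determine $\mathbf{x}^{(n)}$ for n = 1, 2, ....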


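EXAMPLE 3 Example 2 Revisited


The transition matrix in Example 2 was

\[
P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix}
\]

We now construct the probable future donation record of a new graduate who did not give a donation in the
initial year after graduation. For such a graduate the system is initially in state 2 with certainty, so the initial
state vector is

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]

From Theorem 10.5.1 we then have

\[
\begin{aligned}
\mathbf{x}^{(1)} &= P\mathbf{x}^{(0)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} .3 \\ .7 \end{bmatrix} \\
\mathbf{x}^{(2)} &= P\mathbf{x}^{(1)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} .3 \\ .7 \end{bmatrix} = \begin{bmatrix} .45 \\ .55 \end{bmatrix} \\
\mathbf{x}^{(3)} &= P\mathbf{x}^{(2)} = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix} \begin{bmatrix} .45 \\ .55 \end{bmatrix} = \begin{bmatrix} .525 \\ .475 \end{bmatrix}
\end{aligned}
\]

Thus, after three years the alumnus can be expected to make a donation with probability .525. Beyond three
years, we find the following state vectors (to three decimal places):

\[
\mathbf{x}^{(4)} = \begin{bmatrix} .563 \\ .437 \end{bmatrix}, \quad
\mathbf{x}^{(5)} = \begin{bmatrix} .581 \\ .419 \end{bmatrix}, \quad
\mathbf{x}^{(6)} = \begin{bmatrix} .591 \\ .409 \end{bmatrix}, \quad
\mathbf{x}^{(7)} = \begin{bmatrix} .595 \\ .405 \end{bmatrix}
\]

\[
\mathbf{x}^{(8)} = \begin{bmatrix} .598 \\ .402 \end{bmatrix}, \quad
\mathbf{x}^{(9)} = \begin{bmatrix} .599 \\ .401 \end{bmatrix}, \quad
\mathbf{x}^{(10)} = \begin{bmatrix} .599 \\ .401 \end{bmatrix}, \quad
\mathbf{x}^{(11)} = \begin{bmatrix} .600 \\ .400 \end{bmatrix}
\]

For all n beyond 11, we have

\[
\mathbf{x}^{(n)} = \begin{bmatrix} .600 \\ .400 \end{bmatrix}
\]

to three decimal places. In other words, the state vectors converge to a fixed vector as the number of
observations increases. (We will discuss this further below.)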
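
The repeated multiplication in this example is easy to automate. The following Python sketch is an illustration only (the function names are hypothetical); it uses exact fractions for the transition matrix of Example 2 and applies Theorem 10.5.1 one observation at a time.

```python
from fractions import Fraction

def next_state(P, x):
    """One application of Theorem 10.5.1: return the product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

def state_vectors(P, x0, n):
    """Return the list [x(0), x(1), ..., x(n)]."""
    states = [x0]
    for _ in range(n):
        states.append(next_state(P, states[-1]))
    return states

# Transition matrix of Example 2 and the initial state "did not donate".
P = [[Fraction(8, 10), Fraction(3, 10)],
     [Fraction(2, 10), Fraction(7, 10)]]
x0 = [Fraction(0), Fraction(1)]

for n, x in enumerate(state_vectors(P, x0, 11)):
    print(n, [float(v) for v in x])
```

Rounding the printed values to three decimal places gives the state vectors listed in the example.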


EXAMPLE 4 Example 1 Revisited


The transition matrix in Example 1 was

\[
\begin{bmatrix}
.8 & .3 & .2 \\
.1 & .2 & .6 \\
.1 & .5 & .2
\end{bmatrix}
\]

If a car is rented initially from location 2, then the initial state vector is

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
\]

Using this vector and Theorem 10.5.1, one obtains the later state vectors listed in Table 1.


Table 1


For all values of n greater than 11, all state vectors are equal to $\mathbf{x}^{(11)}$ to three decimal places.


Two things should be observed in this example. First, it was not necessary to know how long a customer kept
the car. That is, in a Markov process the time period between observations need not be regular. Second, the
state vectors approach a fixed vector as n increases, just as in the first example.


EXAMPLE 5 Using Theorem 10.5.1 «4 


A traffic officer is assigned to control the traffic at the eight intersections indicated in Figure 10.5.1. 
She is instructed to remain at each intersection for an hour and then to either remain at the same 
intersection or move to a neighboring intersection. To avoid establishing a pattern, she is told to 
choose her new intersection on a random basis, with each possible choice equally likely. For example, 


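if she is at intersection 5, her next intersection can be 2, 4, 5, or 8, each with probability $\tfrac{1}{4}$. Every day
she starts at the location where she stopped the day before. The transition matrix for this Markov chain
is

\[
\begin{bmatrix}
\tfrac{1}{3} & \tfrac{1}{3} & 0 & \tfrac{1}{5} & 0 & 0 & 0 & 0 \\
\tfrac{1}{3} & \tfrac{1}{3} & 0 & 0 & \tfrac{1}{4} & 0 & 0 & 0 \\
0 & 0 & \tfrac{1}{3} & \tfrac{1}{5} & 0 & \tfrac{1}{3} & 0 & 0 \\
\tfrac{1}{3} & 0 & \tfrac{1}{3} & \tfrac{1}{5} & \tfrac{1}{4} & 0 & \tfrac{1}{4} & 0 \\
0 & \tfrac{1}{3} & 0 & \tfrac{1}{5} & \tfrac{1}{4} & 0 & 0 & \tfrac{1}{3} \\
0 & 0 & \tfrac{1}{3} & 0 & 0 & \tfrac{1}{3} & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & \tfrac{1}{5} & 0 & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{3} \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 0 & \tfrac{1}{4} & \tfrac{1}{3}
\end{bmatrix}
\]

(columns: Old Intersection 1 through 8; rows: New Intersection 1 through 8)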

Figure 10.5.1

If the traffic officer begins at intersection 5, her probable locations, hour by hour, are given by the
state vectors given in Table 2. For all values of n greater than 22, all state vectors are equal to $\mathbf{x}^{(22)}$ to
three decimal places. Thus, as with the first two examples, the state vectors approach a fixed vector as
n increases.
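
A transition matrix of this kind can be generated from a list of the choices available at each intersection. The sketch below is an assumption-laden illustration: only the choices at intersection 5 (namely 2, 4, 5, 8) are stated in the text, and the other entries of the hypothetical `CHOICES` table are assumed here to stand in for the street layout of Figure 10.5.1.

```python
from fractions import Fraction

# Assumed choices at each intersection (staying put counts as a choice).
# Only the entry for intersection 5 is given explicitly in the text.
CHOICES = {
    1: [1, 2, 4],
    2: [1, 2, 5],
    3: [3, 4, 6],
    4: [1, 3, 4, 5, 7],
    5: [2, 4, 5, 8],
    6: [3, 6, 7],
    7: [4, 6, 7, 8],
    8: [5, 7, 8],
}

def transition_matrix(choices):
    """Column j holds equal probabilities for each choice from intersection j."""
    k = len(choices)
    P = [[Fraction(0)] * k for _ in range(k)]
    for old, options in choices.items():
        for new in options:
            P[new - 1][old - 1] = Fraction(1, len(options))
    return P

def advance(P, x, steps):
    """Apply x -> P x the given number of times."""
    for _ in range(steps):
        x = [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]
    return x

P = transition_matrix(CHOICES)
start = [Fraction(1) if i == 4 else Fraction(0) for i in range(8)]  # intersection 5
for n in (1, 2, 5, 22):
    print(n, [round(float(v), 3) for v in advance(P, start, n)])
```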


Table 2

\[
\mathbf{x}^{(1)} = \begin{bmatrix} .000 & .250 & .000 & .250 & .250 & .000 & .000 & .250 \end{bmatrix}^T
\]


Limiting Behavior of the State Vectors 


In our examples we saw that the state vectors approached some fixed vector as the number of observations 
increased. We now ask whether the state vectors always approach a fixed vector in a Markov chain. A simple 
example shows that this is not the case. 


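EXAMPLE 6 System Oscillates Between Two State Vectors


Let

\[
P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad \text{and} \quad \mathbf{x}^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

Then, because $P^2 = I$ and $P^3 = P$, we have that

\[
\mathbf{x}^{(0)} = \mathbf{x}^{(2)} = \mathbf{x}^{(4)} = \cdots = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\]

and

\[
\mathbf{x}^{(1)} = \mathbf{x}^{(3)} = \mathbf{x}^{(5)} = \cdots = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]

This system oscillates indefinitely between the two state vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, so it does not
approach any fixed vector.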


However, if we impose a mild condition on the transition matrix, we can show that a fixed limiting state vector is 
approached. This condition is described by the following definition. 


DEFINITION 3 


A transition matrix is regular if some integer power of it has all positive entries. 


Thus, for a regular transition matrix P, there is some positive integer m such that all entries of $P^m$ are positive.
This is the case with the transition matrices of Examples 1 and 2 for $m = 1$. In Example 5 it turns out that $P^4$ has
all positive entries. Consequently, in all three examples the transition matrices are regular.
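
Regularity can be checked mechanically by tracking only which entries of successive powers are positive. The Python sketch below is one possible approach, not part of the text's development: the function name is hypothetical, and the stopping point relies on a known result (Wielandt's bound) that if any power of a k x k nonnegative matrix is entirely positive, then the power (k - 1)^2 + 1 already is.

```python
def smallest_positive_power(P):
    """Return the least m with all entries of P^m positive, or None if P is not regular.

    P is a square transition matrix given as a list of rows. Only the pattern of
    positive entries is tracked, so no rounding issues arise.
    """
    k = len(P)
    pattern = [[entry > 0 for entry in row] for row in P]
    power = pattern
    for m in range(1, (k - 1) ** 2 + 2):      # m = 1, ..., (k-1)^2 + 1
        if all(all(row) for row in power):
            return m
        # Positivity pattern of the next power (boolean matrix product).
        power = [[any(power[i][t] and pattern[t][j] for t in range(k))
                  for j in range(k)] for i in range(k)]
    return None

print(smallest_positive_power([[0.8, 0.3], [0.2, 0.7]]))   # all entries already positive
print(smallest_positive_power([[0, 1], [1, 0]]))           # powers alternate, never all positive
```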


A Markov chain that is governed by a regular transition matrix is called a regular Markov chain. We will see
that every regular Markov chain has a fixed state vector q such that $P^n\mathbf{x}^{(0)}$ approaches q as n increases for any
choice of $\mathbf{x}^{(0)}$. This result is of major importance in the theory of Markov chains. It is based on the following
theorem. 


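THEOREM 10.5.2 Behavior of $P^n$ as $n \to \infty$


If P is a regular transition matrix, then as $n \to \infty$,

\[
P^n \to \begin{bmatrix}
q_1 & q_1 & \cdots & q_1 \\
q_2 & q_2 & \cdots & q_2 \\
\vdots & \vdots & & \vdots \\
q_k & q_k & \cdots & q_k
\end{bmatrix}
\]

where the $q_i$ are positive numbers such that $q_1 + q_2 + \cdots + q_k = 1$.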


We will not prove this theorem here. We refer you to a more specialized text, such as J. Kemeny and J. Snell, 
Finite Markov Chains (New York: Springer-Verlag, 1976). 


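Let us set

\[
Q = \begin{bmatrix}
q_1 & q_1 & \cdots & q_1 \\
q_2 & q_2 & \cdots & q_2 \\
\vdots & \vdots & & \vdots \\
q_k & q_k & \cdots & q_k
\end{bmatrix}
\quad \text{and} \quad
\mathbf{q} = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix}
\]

Thus, Q is a transition matrix, all of whose columns are equal to the probability vector q. Q has the property that
if x is any probability vector, then

\[
Q\mathbf{x} =
\begin{bmatrix}
q_1 & q_1 & \cdots & q_1 \\
q_2 & q_2 & \cdots & q_2 \\
\vdots & \vdots & & \vdots \\
q_k & q_k & \cdots & q_k
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}
=
\begin{bmatrix}
q_1x_1 + q_1x_2 + \cdots + q_1x_k \\
q_2x_1 + q_2x_2 + \cdots + q_2x_k \\
\vdots \\
q_kx_1 + q_kx_2 + \cdots + q_kx_k
\end{bmatrix}
= (x_1 + x_2 + \cdots + x_k)
\begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix}
= (1)\mathbf{q} = \mathbf{q}
\]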

That is, Q transforms any probability vector x into the fixed probability vector q. This result leads to the 
following theorem. 


THEOREM 10.5.3 Behavior of $P^n\mathbf{x}$ as $n \to \infty$


If P is a regular transition matrix and x is any probability vector, then as $n \to \infty$,

\[
P^n\mathbf{x} \to \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_k \end{bmatrix} = \mathbf{q}
\]

where q is a fixed probability vector, independent of n, all of whose entries are positive.


This result holds since Theorem 10.5.2 implies that $P^n \to Q$ as $n \to \infty$. This in turn implies that $P^n\mathbf{x} \to Q\mathbf{x} = \mathbf{q}$
as $n \to \infty$. Thus, for a regular Markov chain, the system eventually approaches a fixed state vector q. The vector
q is called the steady-state vector of the regular Markov chain.


For systems with many states, usually the most efficient technique of computing the steady-state vector q is 
simply to calculate P"x for some large n. Our examples illustrate this procedure. Each is a regular Markov 
process, so that convergence to a steady-state vector is ensured. Another way of computing the steady-state 
vector is to make use of the following theorem. 


THEOREM 10.5.4 Steady-State Vector 


The steady-state vector q of a regular transition matrix P is the unique probability vector that satisfies the 
equation Pq = q. 


To see this, consider the matrix identity $PP^n = P^{n+1}$. By Theorem 10.5.2, both $P^n$ and $P^{n+1}$ approach Q as
$n \to \infty$. Thus, we have $PQ = Q$. Any one column of this matrix equation gives $P\mathbf{q} = \mathbf{q}$. To show that q is the
only probability vector that satisfies this equation, suppose r is another probability vector such that $P\mathbf{r} = \mathbf{r}$. Then
also $P^n\mathbf{r} = \mathbf{r}$ for n = 1, 2, .... When we let $n \to \infty$, Theorem 10.5.3 leads to $\mathbf{q} = \mathbf{r}$.


Theorem 10.5.4 can also be expressed by the statement that the homogeneous linear system

\[
(I - P)\mathbf{q} = \mathbf{0}
\]

has a unique solution vector q with nonnegative entries that satisfy the condition $q_1 + q_2 + \cdots + q_k = 1$. We can
apply this technique to the computation of the steady-state vectors for our examples.


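EXAMPLE 7 Example 2 Revisited


In Example 2 the transition matrix was

\[
P = \begin{bmatrix} .8 & .3 \\ .2 & .7 \end{bmatrix}
\]

so the linear system $(I - P)\mathbf{q} = \mathbf{0}$ is

\[
\begin{bmatrix} .2 & -.3 \\ -.2 & .3 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad (2)
\]

This leads to the single independent equation

\[
.2q_1 - .3q_2 = 0
\]

or

\[
q_1 = 1.5q_2
\]

Thus, when we set $q_2 = s$, any solution of 2 is of the form

\[
\mathbf{q} = s \begin{bmatrix} 1.5 \\ 1 \end{bmatrix}
\]

where s is an arbitrary constant. To make the vector q a probability vector, we set
$s = 1/(1.5 + 1) = .4$. Consequently,

\[
\mathbf{q} = \begin{bmatrix} .6 \\ .4 \end{bmatrix}
\]

is the steady-state vector of this regular Markov chain. This means that over the long run, 60% of
the alumni will give a donation in any one year, and 40% will not. Observe that this agrees with the
result obtained numerically in Example 3.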


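EXAMPLE 8 Example 1 Revisited


In Example 1 the transition matrix was

\[
P = \begin{bmatrix}
.8 & .3 & .2 \\
.1 & .2 & .6 \\
.1 & .5 & .2
\end{bmatrix}
\]

so the linear system $(I - P)\mathbf{q} = \mathbf{0}$ is

\[
\begin{bmatrix}
.2 & -.3 & -.2 \\
-.1 & .8 & -.6 \\
-.1 & -.5 & .8
\end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

The reduced row echelon form of the coefficient matrix is (verify)

\[
\begin{bmatrix}
1 & 0 & -\tfrac{34}{13} \\
0 & 1 & -\tfrac{14}{13} \\
0 & 0 & 0
\end{bmatrix}
\]

so the original linear system is equivalent to the system

\[
\begin{aligned}
q_1 &= \tfrac{34}{13} q_3 \\
q_2 &= \tfrac{14}{13} q_3
\end{aligned}
\]

When we set $q_3 = s$, any solution of the linear system is of the form

\[
\mathbf{q} = s \begin{bmatrix} \tfrac{34}{13} \\ \tfrac{14}{13} \\ 1 \end{bmatrix}
\]

To make this a probability vector, we set $s = 1/\left(\tfrac{34}{13} + \tfrac{14}{13} + 1\right) = \tfrac{13}{61}$. Thus, the steady-state vector of the system is

\[
\mathbf{q} = \begin{bmatrix} \tfrac{34}{61} \\ \tfrac{14}{61} \\ \tfrac{13}{61} \end{bmatrix}
= \begin{bmatrix} .5573\ldots \\ .2295\ldots \\ .2131\ldots \end{bmatrix}
\]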

This agrees with the result obtained numerically in Table 1. The entries of q give the long-run 
probabilities that any one car will be returned to location 1, 2, or 3, respectively. If the car rental 
agency has a fleet of 1000 cars, it should design its facilities so that there are at least 558 spaces at 
location 1, at least 230 spaces at location 2, and at least 214 spaces at location 3. 
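
The calculation in Examples 7 and 8 can be organized as a single routine. The Python sketch below is one possible arrangement, not the text's own procedure: the function name is hypothetical, and instead of reading the answer off the reduced row echelon form it replaces the last equation of (I - P)q = 0 by the condition that the entries of q sum to 1, then solves the resulting system exactly with Gauss-Jordan elimination.

```python
from fractions import Fraction

def steady_state(P):
    """Solve (I - P)q = 0 together with q_1 + ... + q_k = 1 for a regular P."""
    k = len(P)
    # Augmented matrix [I - P | 0]; the last row becomes [1 ... 1 | 1].
    A = [[(Fraction(1) if i == j else Fraction(0)) - Fraction(P[i][j])
          for j in range(k)] + [Fraction(0)] for i in range(k)]
    A[-1] = [Fraction(1)] * k + [Fraction(1)]
    for c in range(k):
        pivot = next(r for r in range(c, k) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]           # scale pivot row to 1
        for r in range(k):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

alumni = [[Fraction("0.8"), Fraction("0.3")],
          [Fraction("0.2"), Fraction("0.7")]]
rentals = [[Fraction("0.8"), Fraction("0.3"), Fraction("0.2")],
           [Fraction("0.1"), Fraction("0.2"), Fraction("0.6")],
           [Fraction("0.1"), Fraction("0.5"), Fraction("0.2")]]

print(steady_state(alumni))
print(steady_state(rentals))
```

Replacing one equation is legitimate here because the rows of I - P sum to the zero row, so one of them is redundant.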


EXAMPLE 9 Example 5 Revisited


We will not give the details of the calculations but simply state that the unique probability vector
solution of the linear system $(I - P)\mathbf{q} = \mathbf{0}$ is

\[
\mathbf{q} = \begin{bmatrix}
\tfrac{3}{28} \\ \tfrac{3}{28} \\ \tfrac{3}{28} \\ \tfrac{5}{28} \\ \tfrac{4}{28} \\ \tfrac{3}{28} \\ \tfrac{4}{28} \\ \tfrac{3}{28}
\end{bmatrix}
= \begin{bmatrix}
.1071\ldots \\ .1071\ldots \\ .1071\ldots \\ .1785\ldots \\ .1428\ldots \\ .1071\ldots \\ .1428\ldots \\ .1071\ldots
\end{bmatrix}
\]

The entries in this vector indicate the proportion of time the traffic officer spends at each
intersection over the long term. Thus, if the objective is for her to spend the same proportion of
time at each intersection, then the strategy of random movement with equal probabilities from one
intersection to another is not a good one. (See Exercise 5.)


Exercise Set 10.5 


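1. Consider the transition matrix

\[
P = \begin{bmatrix} .4 & .5 \\ .6 & .5 \end{bmatrix}
\]

(a) Calculate $\mathbf{x}^{(n)}$ for n = 1, 2, 3, 4, 5 if $\mathbf{x}^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.


(b) State why P is regular and find its steady-state vector.


Answer:
(a)
\[
\mathbf{x}^{(1)} = \begin{bmatrix} .4 \\ .6 \end{bmatrix}, \quad
\mathbf{x}^{(2)} = \begin{bmatrix} .46 \\ .54 \end{bmatrix}, \quad
\mathbf{x}^{(3)} = \begin{bmatrix} .454 \\ .546 \end{bmatrix}, \quad
\mathbf{x}^{(4)} = \begin{bmatrix} .4546 \\ .5454 \end{bmatrix}, \quad
\mathbf{x}^{(5)} = \begin{bmatrix} .45454 \\ .54546 \end{bmatrix}
\]
(b) P is regular since all entries of P are positive; $\mathbf{q} = \begin{bmatrix} \tfrac{5}{11} \\ \tfrac{6}{11} \end{bmatrix}$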
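2. Consider the transition matrix

\[
P = \begin{bmatrix}
.2 & .1 & .7 \\
.6 & .4 & .2 \\
.2 & .5 & .1
\end{bmatrix}
\]

(a) Calculate $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$, and $\mathbf{x}^{(3)}$ to three decimal places if

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

(b) State why P is regular and find its steady-state vector.


Answer:
(a)
\[
\mathbf{x}^{(1)} = \begin{bmatrix} .7 \\ .2 \\ .1 \end{bmatrix}, \quad
\mathbf{x}^{(2)} = \begin{bmatrix} .23 \\ .52 \\ .25 \end{bmatrix}, \quad
\mathbf{x}^{(3)} = \begin{bmatrix} .273 \\ .396 \\ .331 \end{bmatrix}
\]
(b) P is regular, since all entries of P are positive; $\mathbf{q} = \begin{bmatrix} \tfrac{22}{72} \\ \tfrac{29}{72} \\ \tfrac{21}{72} \end{bmatrix}$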
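3. Find the steady-state vectors of the following regular transition matrices:
(a) $\begin{bmatrix} \tfrac{1}{3} & \tfrac{3}{4} \\ \tfrac{2}{3} & \tfrac{1}{4} \end{bmatrix}$
(b) $\begin{bmatrix} .81 & .26 \\ .19 & .74 \end{bmatrix}$
(c) $\begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{2} & 0 \\ \tfrac{1}{3} & 0 & \tfrac{1}{4} \\ \tfrac{1}{3} & \tfrac{1}{2} & \tfrac{3}{4} \end{bmatrix}$

Answer:
(a) $\begin{bmatrix} \tfrac{9}{17} \\ \tfrac{8}{17} \end{bmatrix}$
(b) $\begin{bmatrix} \tfrac{26}{45} \\ \tfrac{19}{45} \end{bmatrix}$
(c) $\begin{bmatrix} \tfrac{3}{19} \\ \tfrac{4}{19} \\ \tfrac{12}{19} \end{bmatrix}$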


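4. Let P be the transition matrix

\[
P = \begin{bmatrix} \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 1 \end{bmatrix}
\]

(a) Show that P is not regular.

(b) Show that as n increases, $P^n\mathbf{x}^{(0)}$ approaches $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for any initial state vector $\mathbf{x}^{(0)}$.

(c) What conclusion of Theorem 10.5.3 is not valid for the steady state of this transition matrix?
Answer:

(a) $P^n = \begin{bmatrix} \left(\tfrac{1}{2}\right)^n & 0 \\ 1 - \left(\tfrac{1}{2}\right)^n & 1 \end{bmatrix}$, n = 1, 2, .... Thus, no integer power of P has all positive entries.

(b) $P^n \to \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}$ as n increases, so $P^n\mathbf{x}^{(0)} \to \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for any $\mathbf{x}^{(0)}$ as n increases.


(c) The entries of the limiting vector $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are not all positive.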


5. Verify that if P is a $k \times k$ regular transition matrix all of whose row sums are equal to 1, then the entries of its
steady-state vector are all equal to $1/k$.


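6. Show that the transition matrix

\[
P = \begin{bmatrix}
\tfrac{1}{2} & \tfrac{1}{2} & 0 \\
\tfrac{1}{2} & 0 & \tfrac{1}{2} \\
0 & \tfrac{1}{2} & \tfrac{1}{2}
\end{bmatrix}
\]

is regular, and use Exercise 5 to find its steady-state vector.


Answer:

\[
P^2 = \begin{bmatrix}
\tfrac{1}{2} & \tfrac{1}{4} & \tfrac{1}{4} \\
\tfrac{1}{4} & \tfrac{1}{2} & \tfrac{1}{4} \\
\tfrac{1}{4} & \tfrac{1}{4} & \tfrac{1}{2}
\end{bmatrix}
\]

has all positive entries; $\mathbf{q} = \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{1}{3} \\ \tfrac{1}{3} \end{bmatrix}$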


7. John is either happy or sad. If he is happy one day, then he is happy the next day four times out of five. If he 
is sad one day, then he is sad the next day one time out of three. Over the long term, what are the chances that 
John is happy on any given day? 


Answer: 


$\tfrac{10}{13}$

8. A country is divided into three demographic regions. It is found that each year 5% of the residents of region 1
move to region 2, and 5% move to region 3. Of the residents of region 2, 15% move to region 1 and 10%
move to region 3. And of the residents of region 3, 10% move to region 1 and 5% move to region 2. What
percentage of the population resides in each of the three regions after a long period of time?


Answer:


$54\tfrac{1}{6}\%$ in region 1, $16\tfrac{2}{3}\%$ in region 2, and $29\tfrac{1}{6}\%$ in region 3


Section 10.5 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic 
proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be 
able to use your technology utility to solve many of the problems in the regular exercise sets. 


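T1. Consider the sequence of transition matrices
$\{P_2, P_3, P_4, \ldots\}$
with

\[
P_2 = \begin{bmatrix} 0 & \tfrac{1}{2} \\ 1 & \tfrac{1}{2} \end{bmatrix}, \quad
P_3 = \begin{bmatrix} 0 & 0 & \tfrac{1}{3} \\ 0 & \tfrac{1}{2} & \tfrac{1}{3} \\ 1 & \tfrac{1}{2} & \tfrac{1}{3} \end{bmatrix}, \quad
P_4 = \begin{bmatrix} 0 & 0 & 0 & \tfrac{1}{4} \\ 0 & 0 & \tfrac{1}{3} & \tfrac{1}{4} \\ 0 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \\ 1 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} \end{bmatrix}
\]

\[
P_5 = \begin{bmatrix}
0 & 0 & 0 & 0 & \tfrac{1}{5} \\
0 & 0 & 0 & \tfrac{1}{4} & \tfrac{1}{5} \\
0 & 0 & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \\
0 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5} \\
1 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{1}{4} & \tfrac{1}{5}
\end{bmatrix}
\]

and so on.

(a) Use a computer to show that each of these four matrices is regular by computing their squares.

(b) Verify Theorem 10.5.2 by computing the 100th power of $P_k$ for k = 2, 3, 4, 5. Then make a conjecture as to
the limiting value of $P_k^n$ as $n \to \infty$ for all k = 2, 3, 4, ....

(c) Verify that the common column $\mathbf{q}_k$ of the limiting matrix you found in part (b) satisfies the equation
$P_k\mathbf{q}_k = \mathbf{q}_k$, as required by Theorem 10.5.4.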

T2. A mouse is placed in a box with nine rooms as shown in the accompanying figure. Assume that it is equally 

likely that the mouse goes through any door in the room or stays in the room. 

(a) Construct the $9 \times 9$ transition matrix for this problem and show that it is regular.

(b) Determine the steady-state vector for the matrix. 


(c) Use a symmetry argument to show that this problem may be solved using only a 3 x 3 matrix. 


Figure Ex-T2 
10.6 Graph Theory


In this section we introduce matrix representations of relations among members of a set. We use matrix 
arithmetic to analyze these relationships. 


Prerequisites 


Matrix Addition and Multiplication 


Relations Among Members of a Set 


There are countless examples of sets with finitely many members in which some relation exists among
members of the set. For example, the set could consist of a collection of people, animals, countries,
companies, sports teams, or cities; and the relation between two members, A and B, of such a set could be that
person A dominates person B, animal A feeds on animal B, country A militarily supports country B, company
A sells its product to company B, sports team A consistently beats sports team B, or city A has a direct airline
flight to city B.


We will now show how the theory of directed graphs can be used to mathematically model relations such as 
those in the preceding examples. 


Directed Graphs 


A directed graph is a finite set of elements, $\{P_1, P_2, \ldots, P_n\}$, together with a finite collection of ordered
pairs $(P_i, P_j)$ of distinct elements of this set, with no ordered pair being repeated. The elements of the set are
called vertices, and the ordered pairs are called directed edges, of the directed graph. We use the notation
$P_i \to P_j$ (which is read "$P_i$ is connected to $P_j$") to indicate that the directed edge $(P_i, P_j)$ belongs to the
directed graph. Geometrically, we can visualize a directed graph (Figure 10.6.1) by representing the vertices
as points in the plane and representing the directed edge $P_i \to P_j$ by drawing a line or arc from vertex $P_i$ to
vertex $P_j$, with an arrow pointing from $P_i$ to $P_j$. If both $P_i \to P_j$ and $P_j \to P_i$ hold (denoted $P_i \leftrightarrow P_j$), we
draw a single line between $P_i$ and $P_j$ with two oppositely pointing arrows (as with $P_2$ and $P_3$ in the figure).


Figure 10.6.1 


As in Figure 10.6.1, for example, a directed graph may have separate "components" of vertices that are
connected only among themselves; and some vertices, such as $P_5$, may not be connected with any other
vertex. Also, because $P_i \to P_i$ is not permitted in a directed graph, a vertex cannot be connected with itself by
a single arc that does not pass through any other vertex.


Figure 10.6.2 shows diagrams representing three more examples of directed graphs. With a directed graph
having n vertices, we may associate an $n \times n$ matrix $M = [m_{ij}]$, called the vertex matrix of the directed
graph. Its elements are defined by

\[
m_{ij} = \begin{cases} 1, & \text{if } P_i \to P_j \\ 0, & \text{otherwise} \end{cases}
\]

for i, j = 1, 2, ..., n. For the three directed graphs in Figure 10.6.2, the corresponding vertex matrices are
0100 
; | 10010 
Figure a: М = 01201 
000 0 
| 10040. 1 
00110 
Figure b: M=|0 0010 
01001 
01100 
0100 
1010 
F | М = 
pu 1001 
1-49: 0 


Figure 10.6.2 (panels (a), (b), and (c))


By their definition, vertex matrices have the following two properties: 
(i) All entries are either 0 or 1. 
(ii) All diagonal entries are 0. 


Conversely, any matrix with these two properties determines a unique directed graph having the given matrix 
as its vertex matrix. For example, the matrix 


о н оо 
оо о н 
оо н н 
о н о о 


determines the directed graph in Figure 10.6.3. 


Figure 10.6.3


EXAMPLE 1 Influences Within a Family + 


A certain family consists of a mother, father, daughter, and two sons. The family members have 
influence, or power, over each other in the following ways: the mother can influence the 
daughter and the oldest son; the father can influence the two sons; the daughter can influence 
the father; the oldest son can influence the youngest son; and the youngest son can influence the 
mother. We may model this family influence pattern with a directed graph whose vertices are 
the five family members. If family member A influences family member B, we write $A \to B$.
Figure 10.6.4 is the resulting directed graph, where we have used obvious letter designations for
the five family members. The vertex matrix of this directed graph is

\[
\begin{array}{c|ccccc}
 & M & F & D & OS & YS \\ \hline
M & 0 & 0 & 1 & 1 & 0 \\
F & 0 & 0 & 0 & 1 & 1 \\
D & 0 & 1 & 0 & 0 & 0 \\
OS & 0 & 0 & 0 & 0 & 1 \\
YS & 1 & 0 & 0 & 0 & 0
\end{array}
\]


Figure 10.6.4


EXAMPLE 2 Vertex Matrix: Moves on a Chessboard


In chess the knight moves in an "L"-shaped pattern about the chessboard. For the board in
Figure 10.6.5 it may move horizontally two squares and then vertically one square, or it may
move vertically two squares and then horizontally one square. Thus, from the center square in
the figure, the knight may move to any of the eight marked shaded squares. Suppose that the
knight is restricted to the nine numbered squares in Figure 10.6.6. If by $i \to j$ we mean that the
knight may move from square i to square j, the directed graph in Figure 10.6.7 illustrates all
possible moves that the knight may make among these nine squares. In Figure 10.6.8 we have
"unraveled" Figure 10.6.7 to make the pattern of possible moves clearer.


The vertex matrix of this directed graph is given by

\[
M = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


Figure 10.6.7


Figure 10.6.8


In Example 1 the father cannot directly influence the mother; that is, $F \to M$ is not true. But he can influence
the youngest son, who can then influence the mother. We write this as $F \to YS \to M$ and call it a 2-step
connection from F to M. Analogously, we call $M \to D$ a 1-step connection, $F \to OS \to YS \to M$ a 3-step
connection, and so forth. Let us now consider a technique for finding the number of all possible r-step
connections (r = 1, 2, ...) from one vertex $P_i$ to another vertex $P_j$ of an arbitrary directed graph. (This will
include the case when $P_i$ and $P_j$ are the same vertex.) The number of 1-step connections from $P_i$ to $P_j$ is
simply $m_{ij}$. That is, there is either zero or one 1-step connection from $P_i$ to $P_j$, depending on whether $m_{ij}$ is
zero or one. For the number of 2-step connections, we consider the square of the vertex matrix. If we let $m_{ij}^{(2)}$
be the (i, j)-th element of $M^2$, we have

\[
m_{ij}^{(2)} = m_{i1}m_{1j} + m_{i2}m_{2j} + \cdots + m_{in}m_{nj} \qquad (1)
\]

Now, if $m_{i1} = m_{1j} = 1$, there is a 2-step connection $P_i \to P_1 \to P_j$ from $P_i$ to $P_j$. But if either $m_{i1}$ or $m_{1j}$ is
zero, such a 2-step connection is not possible. Thus $P_i \to P_1 \to P_j$ is a 2-step connection if and only if
$m_{i1}m_{1j} = 1$. Similarly, for any k = 1, 2, ..., n, $P_i \to P_k \to P_j$ is a 2-step connection from $P_i$ to $P_j$ if and
only if the term $m_{ik}m_{kj}$ on the right side of 1 is one; otherwise, the term is zero. Thus, the right side of 1 is
the total number of 2-step connections from $P_i$ to $P_j$.


A similar argument will work for finding the number of 3-, 4-, ..., r-step connections from $P_i$ to $P_j$. In
general, we have the following result.


THEOREM 10.6.1 


Let M be the vertex matrix of a directed graph and let $m_{ij}^{(r)}$ be the (i, j)-th element of $M^r$. Then $m_{ij}^{(r)}$
is equal to the number of r-step connections from $P_i$ to $P_j$.
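
As a small illustration of Theorem 10.6.1, the Python sketch below counts r-step connections in the family influence graph of Example 1 by raising its vertex matrix to the rth power. The function and variable names are hypothetical and chosen only for this sketch.

```python
NAMES = ["M", "F", "D", "OS", "YS"]

# Vertex matrix of Example 1: row = influencer, column = person influenced.
FAMILY = [
    [0, 0, 1, 1, 0],   # mother -> daughter, oldest son
    [0, 0, 0, 1, 1],   # father -> oldest son, youngest son
    [0, 1, 0, 0, 0],   # daughter -> father
    [0, 0, 0, 0, 1],   # oldest son -> youngest son
    [1, 0, 0, 0, 0],   # youngest son -> mother
]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, r):
    """M^r for an integer r >= 1."""
    result = m
    for _ in range(r - 1):
        result = mat_mul(result, m)
    return result

def connections(m, start, end, r):
    """Number of r-step connections from vertex `start` to vertex `end`."""
    return mat_pow(m, r)[NAMES.index(start)][NAMES.index(end)]

for r in (1, 2, 3):
    print(r, connections(FAMILY, "F", "M", r))
```

For the father and mother this gives no 1-step connection, one 2-step connection (through the youngest son), and one 3-step connection (through the oldest and then the youngest son), in agreement with the discussion preceding the theorem.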


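EXAMPLE 3 Using Theorem 10.6.1


Figure 10.6.9 is the route map of a small airline that services the four cities $P_1, P_2, P_3, P_4$. As
a directed graph, its vertex matrix is

\[
M = \begin{bmatrix}
0 & 1 & 1 & 0 \\
1 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0
\end{bmatrix}
\]

We have that

\[
M^2 = \begin{bmatrix}
2 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 \\
0 & 2 & 2 & 0 \\
2 & 0 & 1 & 1
\end{bmatrix}
\quad \text{and} \quad
M^3 = \begin{bmatrix}
1 & 3 & 3 & 1 \\
2 & 2 & 3 & 1 \\
4 & 0 & 2 & 2 \\
1 & 3 & 3 & 1
\end{bmatrix}
\]

If we are interested in connections from city $P_4$ to city $P_3$, we may use Theorem 10.6.1 to find
their number. Because $m_{43} = 1$, there is one 1-step connection; because $m_{43}^{(2)} = 1$, there is one
2-step connection; and because $m_{43}^{(3)} = 3$, there are three 3-step connections. To verify this,
from Figure 10.6.9 we find


1-step connections from $P_4$ to $P_3$: $P_4 \to P_3$
2-step connections from $P_4$ to $P_3$: $P_4 \to P_2 \to P_3$
3-step connections from $P_4$ to $P_3$: $P_4 \to P_3 \to P_4 \to P_3$
$P_4 \to P_2 \to P_1 \to P_3$
$P_4 \to P_3 \to P_1 \to P_3$


Figure 10.6.9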


Cliques 


In everyday language a “clique” is a closely knit group of people (usually three or more) that tends to 
communicate within itself and has no place for outsiders. In graph theory this concept is given a more precise 
meaning. 


DEFINITION 1 


A subset of a directed graph is called a clique if it satisfies the following three conditions: 
(i) The subset contains at least three vertices. 
(ii) For each pair of vertices $P_i$ and $P_j$ in the subset, both $P_i \to P_j$ and $P_j \to P_i$ are true.


(iii) The subset is as large as possible; that is, it is not possible to add another vertex to the subset and 
still satisfy condition (ii). 


This definition suggests that cliques are maximal subsets that are in perfect "communication" with each other.
For example, if the vertices represent cities, and $P_i \to P_j$ means that there is a direct airline flight from city
$P_i$ to city $P_j$, then there is a direct flight between any two cities within a clique in either direction.


EXAMPLE 4 A Directed Graph with Two Cliques


The directed graph illustrated in Figure 10.6.10 (which might represent the route map of an
airline) has two cliques:

\[
\{P_1, P_2, P_3, P_4\} \quad \text{and} \quad \{P_3, P_4, P_6\}
\]

This example shows that a directed graph may contain several cliques and that a vertex may
simultaneously belong to more than one clique.


Figure 10.6.10


For simple directed graphs, cliques can be found by inspection. But for large directed graphs, it would be
desirable to have a systematic procedure for detecting cliques. For this purpose, it will be helpful to define a
matrix $S = [s_{ij}]$ related to a given directed graph as follows:

\[
s_{ij} = \begin{cases} 1, & \text{if } P_i \leftrightarrow P_j \\ 0, & \text{otherwise} \end{cases}
\]

The matrix S determines a directed graph that is the same as the given directed graph, with the exception that
the directed edges with only one arrow are deleted. For example, if the original directed graph is given by
Figure 10.6.11a, the directed graph that has S as its vertex matrix is given in Figure 10.6.11b. The matrix S
may be obtained from the vertex matrix M of the original directed graph by setting $s_{ij} = 1$ if $m_{ij} = m_{ji} = 1$
and setting $s_{ij} = 0$ otherwise.


Figure 10.6.11 (panels (a) and (b))


The following theorem, which uses the matrix S, is helpful for identifying cliques. 


THEOREM 10.6.2 Identifying Cliques


Let $s_{ij}^{(3)}$ be the (i, j)-th element of $S^3$. Then a vertex $P_i$ belongs to some clique if and only if $s_{ii}^{(3)} \ne 0$.


Proof If $s_{ii}^{(3)} \ne 0$, then there is at least one 3-step connection from $P_i$ to itself in the modified directed graph
determined by S. Suppose it is $P_i \to P_j \to P_k \to P_i$. In the modified directed graph, all directed relations are
two-way, so we also have the connections $P_i \leftrightarrow P_j \leftrightarrow P_k \leftrightarrow P_i$. But this means that $\{P_i, P_j, P_k\}$ is either a
clique or a subset of a clique. In either case, $P_i$ must belong to some clique. The converse statement, "if $P_i$
belongs to a clique, then $s_{ii}^{(3)} \ne 0$," follows in a similar manner.


EXAMPLE 5 Using Theorem 10.6.2 < 


Suppose that a directed graph has as its vertex matrix 


$$M = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
Then 
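$$S = \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad S^3 = \begin{bmatrix} 0 & 3 & 0 & 2 \\ 3 & 0 & 2 & 0 \\ 0 & 2 & 0 & 1 \\ 2 & 0 & 1 & 0 \end{bmatrix}$$
Because all diagonal entries of $S^3$ are zero, it follows from Theorem 10.6.2 that the directed 
graph has no cliques. 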


EXAMPLE 6 Using Theorem 10.6.2 


Suppose that a directed graph has as its vertex matrix 


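$$M = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{bmatrix}$$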

Then 

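$$S = \begin{bmatrix} 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad S^3 = \begin{bmatrix} 2 & 4 & 0 & 4 & 3 \\ 4 & 2 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 4 & 3 & 0 & 2 & 1 \\ 3 & 1 & 0 & 1 & 0 \end{bmatrix}$$

The nonzero diagonal entries of $S^3$ are $s_{11}^{(3)}$, $s_{22}^{(3)}$, and $s_{44}^{(3)}$. Consequently, in the given directed 
graph, $P_1$, $P_2$, and $P_4$ belong to cliques. Because a clique must contain at least three vertices, 
the directed graph has only one clique, $\{P_1, P_2, P_4\}$. 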
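
The test in Theorem 10.6.2 is easy to automate. The following is a minimal sketch, assuming NumPy is available; the function name `clique_vertices` is illustrative and not part of the text. It builds S from M by keeping only the two-way edges, cubes it, and reports the vertices with a nonzero diagonal entry, using the vertex matrix of Example 6.

```python
import numpy as np

def clique_vertices(M):
    """Return the 1-based indices i for which the (i, i) entry of S^3 is nonzero."""
    M = np.asarray(M, dtype=int)
    # s_ij = 1 exactly when m_ij = m_ji = 1 (entries are 0 or 1, so a product works)
    S = M * M.T
    S3 = np.linalg.matrix_power(S, 3)
    return [i + 1 for i in range(S3.shape[0]) if S3[i, i] != 0]

# Vertex matrix of Example 6
M = [[0, 1, 0, 1, 1],
     [1, 0, 0, 1, 0],
     [1, 1, 0, 1, 0],
     [1, 1, 0, 0, 0],
     [1, 0, 0, 1, 0]]

print(clique_vertices(M))  # [1, 2, 4]
```

Note that the theorem only tells us which vertices lie in some clique; grouping those vertices into the cliques themselves still requires inspection or a further search.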


Dominance-Directed Graphs 


In many groups of individuals or animals, there is a definite "pecking order" or dominance relation between 
any two members of the group. That is, given any two individuals A and B, either A dominates B or B 
dominates A, but not both. In terms of a directed graph in which $P_i \to P_j$ means $P_i$ dominates $P_j$, this means 
that for all distinct pairs, either $P_i \to P_j$ or $P_j \to P_i$, but not both. In general, we have the following 
definition. 


DEFINITION 2 


A dominance-directed graph is a directed graph such that for any distinct pair of vertices $P_i$ and $P_j$, 
either $P_i \to P_j$ or $P_j \to P_i$, but not both. 


An example of a directed graph satisfying this definition is a league of n sports teams that play each other 
exactly one time, as in one round of a round-robin tournament in which no ties are allowed. If $P_i \to P_j$ means 
that team $P_i$ beat team $P_j$ in their single match, it is easy to see that the definition of a dominance-directed 
graph is satisfied. For this reason, dominance-directed graphs are sometimes called tournaments. 


Figure 10.6.12 illustrates some dominance-directed graphs with three, four, and five vertices, respectively. In 
these three graphs, the circled vertices have the following interesting property: from each one there is either a 
1-step or a 2-step connection to any other vertex in its graph. In a sports tournament, these vertices would 
correspond to the most "powerful" teams in the sense that these teams either beat any given team or beat 
some other team that beat the given team. We can now state and prove a theorem that guarantees that any 
dominance-directed graph has at least one vertex with this property. 


THEOREM 10.6.3 Connections in Dominance-Directed Graphs 


In any dominance-directed graph, there is at least one vertex from which there is a 1-step or 2-step 
connection to any other vertex. 


Proof Consider a vertex (there may be several) with the largest total number of 1-step and 2-step 
connections to other vertices in the graph. By renumbering the vertices, we may assume that $P_1$ is such a 
vertex. Suppose there is some vertex $P_i$ such that there is no 1-step or 2-step connection from $P_1$ to $P_i$. Then, 
in particular, $P_1 \to P_i$ is not true, so that by definition of a dominance-directed graph, it must be that 
$P_i \to P_1$. Next, let $P_k$ be any vertex such that $P_1 \to P_k$ is true. Then we cannot have $P_k \to P_i$, as then 
$P_1 \to P_k \to P_i$ would be a 2-step connection from $P_1$ to $P_i$. Thus, it must be that $P_i \to P_k$. That is, $P_i$ has 
1-step connections to all the vertices to which $P_1$ has 1-step connections. The vertex $P_i$ must then also have 
2-step connections to all the vertices to which $P_1$ has 2-step connections. But because, in addition, we have 
that $P_i \to P_1$, this means that $P_i$ has more 1-step and 2-step connections to other vertices than does $P_1$. 
However, this contradicts the way in which $P_1$ was chosen. Hence, there can be no vertex $P_i$ to which $P_1$ has 
no 1-step or 2-step connection. 


Figure 10.6.12 


This proof shows that a vertex with the largest total number of 1-step and 2-step connections to other vertices 
has the property stated in the theorem. There is a simple way of finding such vertices using the vertex matrix 
M and its square $M^2$. The sum of the entries in the ith row of M is the total number of 1-step connections 
from $P_i$ to other vertices, and the sum of the entries of the ith row of $M^2$ is the total number of 2-step 
connections from $P_i$ to other vertices. Consequently, the sum of the entries of the ith row of the matrix 
$A = M + M^2$ is the total number of 1-step and 2-step connections from $P_i$ to other vertices. In other words, 
a row of $A = M + M^2$ with the largest row sum identifies a vertex having the property stated in Theorem 
10.6.3. 


EXAMPLE 7 Using Theorem 10.6.3 


Suppose that five baseball teams play each other exactly once, and the results are as indicated in 
the dominance-directed graph of Figure 10.6.13. The vertex matrix of the graph is 


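$$M = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}$$

so 

$$A = M + M^2 = \begin{bmatrix} 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 2 & 3 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 2 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 & 2 & 0 \\ 2 & 0 & 3 & 3 & 1 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 \\ 1 & 1 & 2 & 3 & 0 \end{bmatrix}$$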


The row sums of A are 
1st row sum = 4 
2nd row sum = 9 
3rd row sum = 2 
4th row sum = 4 
5th row sum = 7 
Because the second row has the largest row sum, the vertex $P_2$ must have a 1-step or 2-step 
connection to any other vertex. This is easily verified from Figure 10.6.13. 


Figure 10.6.13 


We have informally suggested that a vertex with the largest number of 1-step and 2-step connections to other 
vertices is a "powerful" vertex. We can formalize this concept with the following definition. 


DEFINITION 3 


The power of a vertex of a dominance-directed graph is the total number of 1-step and 2-step 
connections from it to other vertices. Alternatively, the power of a vertex $P_i$ is the sum of the entries 
of the ith row of the matrix $A = M + M^2$, where M is the vertex matrix of the directed graph. 


EXAMPLE 8 Example 7 Revisited 


Let us rank the five baseball teams in Example 7 according to their powers. From the 
calculations for the row sums in that example, we have 


Power of team $P_1$ = 4 
Power of team $P_2$ = 9 
Power of team $P_3$ = 2 
Power of team $P_4$ = 4 
Power of team $P_5$ = 7 


Hence, the ranking of the teams according to their powers would be 
$P_2$ (first), $P_5$ (second), $P_1$ and $P_4$ (tied for third), $P_3$ (last) 
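
The power computation of Definition 3 can be reproduced in a few lines. This is a minimal sketch, assuming NumPy; the helper name `vertex_powers` is illustrative. It forms $A = M + M^2$ for the vertex matrix of Example 7 and takes row sums.

```python
import numpy as np

def vertex_powers(M):
    """Row sums of A = M + M^2: the number of 1-step and 2-step connections from each vertex."""
    M = np.asarray(M, dtype=int)
    A = M + M @ M
    return A.sum(axis=1).tolist()

# Vertex matrix of Example 7 (five baseball teams)
M = [[0, 0, 1, 1, 0],
     [1, 0, 1, 0, 1],
     [0, 0, 0, 1, 0],
     [0, 1, 0, 0, 0],
     [1, 0, 1, 1, 0]]

powers = vertex_powers(M)
print(powers)  # [4, 9, 2, 4, 7]

# Teams listed from most to least powerful; tied teams keep their numerical order
ranking = sorted(range(1, len(powers) + 1), key=lambda i: -powers[i - 1])
print(ranking)  # [2, 5, 1, 4, 3]
```

The sorted list does not mark ties, so the equal powers of $P_1$ and $P_4$ still have to be read off from the `powers` list.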


Exercise Set 10.6 


1. Construct the vertex matrix for each of the directed graphs illustrated in Figure Ex-1. 


Figure Ex-1 


Answer: 


or о о о 
т" тсз + © © о чо о 
oro oOo нло о с m 
oor со Tr oc о Oo 
Orr 0 © © r © © 


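(c) 
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$$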
2. Draw a diagram of the directed graph corresponding to each of the following vertex matrices. 


С» m т=з OO 
coro O Orc о 
= су Oo rf vr 0 OO = 
—-0 00 O ©з тз © m 
onom оноо ort 
= © 


01010 1 
100010 
0000200 
110010 
000101 
010010 


(c) 


Answer: 


(a) ^ 


P, 


P, 


Р, Р, 
Р, Р, P. 
(c) g e 
Ps Ps P, 


3. Let M be the following vertex matrix of a directed graph: 


(a) Draw a diagram of the directed graph. 


(b) Use Theorem 10.6.1 to find the number of 1-, 2-, and 3-step connections from the vertex $P_1$ to the 
vertex $P_2$. Verify your answer by listing the various connections as in Example 3. 


(c) Repeat part (b) for the 1-, 2-, and 3-step connections from $P_1$ to $P_4$. 
Answer: 


(a) 


P. Р, 
(b) om step: Ру + P4 
2 = step: Рү — P4— P3 
Рү- Рз» P3 
3—step: P1 — Pa — Ру Р 
Pi => Рз Pa Pa 
Pi = Ра + Рз Pa 
(c) l= step: Ру — Рд 
2 = step: Рү — Рз — Рд 
3 = step: Py э Pa — Ру Рд 
Pi Ра + P3— Рд 


4. (a) Compute the matrix product $M^T M$ for the vertex matrix M in Example 1. 


(b) Verify that the kth diagonal entry of $M^T M$ is the number of family members who influence the kth 
family member. Why is this true? 


(c) Find a similar interpretation for the values of the nondiagonal entries of $M^T M$. 


Answer: 

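(a) 
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$


(c) The (i, j)-th entry is the number of family members who influence both the ith and jth family members. 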


5. By inspection, locate all cliques in each of the directed graphs illustrated in Figure Ex-5. 
Figure Ex-5 


Answer: 


(a) $\{P_1, P_2, P_3\}$ 
(b) $\{P_3, P_4, P_5\}$ 
(c) (P2, Рд, Pg, Pg) and (Pa, Ps, Pe) 


6. For each of the following vertex matrices, use Theorem 10.6.2 to find all cliques in the corresponding 
directed graphs. 


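(a) 
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 & 0 \end{bmatrix}$$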

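(b) 
$$\begin{bmatrix} 0 & 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}$$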

Answer: 

(a) None 


(b) $\{P_3, P_4, P_6\}$ 


7. For the dominance-directed graph illustrated in Figure Ex-7 construct the vertex matrix and find the power 
of each vertex. 


Figure Ex-7 
Answer: 
$$M = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
Power of $P_1$ = 5 
Power of $P_2$ = 3 
Power of $P_3$ = 4 
Power of $P_4$ = 2 
8. Five baseball teams play each other one time with the following results: 
A beats B, C, D 
B beats C, E 
C beats D, E 
D beats B 
E beats A, D 


Rank the five baseball teams in accordance with the powers of the vertices they correspond to in the 
dominance-directed graph representing the outcomes of the games. 


Answer: 


First, A; second, B and E (tie); fourth, C; fifth, D 
Section 10.6 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the 
relevant documentation for the particular utility you are using. The goal of these exercises is to provide you 
with a basic proficiency with your technology utility. Once you have mastered the techniques in these 
exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


T1. A graph having n vertices such that every vertex is connected to every other vertex has a vertex matrix 
given by 


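$$M_n = \begin{bmatrix} 0 & 1 & 1 & 1 & \cdots & 1 \\ 1 & 0 & 1 & 1 & \cdots & 1 \\ 1 & 1 & 0 & 1 & \cdots & 1 \\ 1 & 1 & 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & 1 & \cdots & 0 \end{bmatrix}$$
In this problem we develop a formula for $M_n^k$ whose (i, j)-th entry equals the number of k-step connections 
from $P_i$ to $P_j$. 
(a) Use a computer to compute the eight matrices $M_n^k$ for n = 2, 3 and for k = 2, 3, 4, 5. 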


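(b) Use the results in part (a) and symmetry arguments to show that $M_n^k$ can be written as 

$$M_n^k = \begin{bmatrix} 0 & 1 & 1 & \cdots & 1 \\ 1 & 0 & 1 & \cdots & 1 \\ 1 & 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & 0 \end{bmatrix}^k = \begin{bmatrix} \alpha_k & \beta_k & \beta_k & \cdots & \beta_k \\ \beta_k & \alpha_k & \beta_k & \cdots & \beta_k \\ \beta_k & \beta_k & \alpha_k & \cdots & \beta_k \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \beta_k & \beta_k & \beta_k & \cdots & \alpha_k \end{bmatrix}$$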

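(c) Using the fact that $M_n^k = M_n M_n^{k-1}$, show that 

$$\begin{bmatrix} \alpha_k \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix} \begin{bmatrix} \alpha_{k-1} \\ \beta_{k-1} \end{bmatrix}$$

with 

$$\begin{bmatrix} \alpha_0 \\ \beta_0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

(d) Using part (c), show that 

$$\begin{bmatrix} \alpha_k \\ \beta_k \end{bmatrix} = \begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix}^k \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

(e) Use the methods of Section 5.2 to compute 

$$\begin{bmatrix} 0 & n-1 \\ 1 & n-2 \end{bmatrix}^k$$

and thereby obtain expressions for $\alpha_k$ and $\beta_k$, and eventually show that 

$$M_n^k = \frac{(n-1)^k - (-1)^k}{n}\, U_n + (-1)^k I_n$$

where $U_n$ is the n × n matrix all of whose entries are ones and $I_n$ is the n × n identity matrix. 


(f) Show that for n > 2, all vertices for these directed graphs belong to cliques. 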


T2. Consider a round-robin tournament among n players (labeled $a_1, a_2, a_3, \ldots, a_n$) where $a_1$ beats $a_2$, $a_2$ 
beats $a_3$, $a_3$ beats $a_4$, ..., $a_{n-1}$ beats $a_n$, and $a_n$ beats $a_1$. Compute the "power" of each player, showing that 
they all have the same power; then determine that common power. 

[Hint: Use a computer to study the cases n = 3, 4, 5, 6; then make a conjecture and prove your conjecture to 
be true.] 


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


10.7 Games of Strategy 


In this section we discuss a general game in which two competing players choose separate strategies to reach opposing 
objectives. The optimal strategy of each player is found in certain cases with the use of matrix techniques. 


Prerequisites 


Matrix Multiplication 
Basic Probability Concepts 


Game Theory 


To introduce the basic concepts in the theory of games, we will consider the following carnival-type game that two 
people agree to play. We will call the participants in the game player R and player C. Each player has a stationary wheel 
with a movable pointer on it as in Figure 10.7.1. For reasons that will become clear, we will call player R's wheel the 
row-wheel and player C's wheel the column-wheel. The row-wheel is divided into three sectors numbered 1, 2, and 3, 
and the column-wheel is divided into four sectors numbered 1, 2, 3, and 4. The fractions of the area occupied by the 
various sectors are indicated in the figure. To play the game, each player spins the pointer of his or her wheel and lets it 
come to rest at random. The number of the sector in which each pointer comes to rest is called the move of that player. 
Thus, player R has three possible moves and player C has four possible moves. Depending on the move each player 
makes, player C then makes a payment of money to player R according to Table 1. 


Figure 10.7.1 Row-wheel of player R and column-wheel of player C 


Table 1 

                          Player C's Move 
                          1      2      3      4 
Player R's Move     1     $3     $5    -$2    -$1 
                    2    -$2     $4    -$3    -$4 
                    3     $6    -$5     $0     $3 

For example, if the row-wheel pointer comes to rest in sector 1 (player R makes move 1), and the column-wheel pointer 
comes to rest in sector 2 (player C makes move 2), then player C must pay player R the sum of $5. Some of the entries 
in this table are negative, indicating that player C makes a negative payment to player R. By this we mean that player R 
makes a positive payment to player C. For example, if the row-wheel shows 2 and the column-wheel shows 4, then 
player R pays player C the sum of $4, because the corresponding entry in the table is —$4. In this way the positive entries 
of the table are the gains of player R and the losses of player C, and the negative entries are the gains of player C and the 
losses of player R. 


In this game the players have no control over their moves; each move is determined by chance. However, if each player 
can decide whether he or she wants to play, then each would want to know how much he or she can expect to win or lose 
over the long term if he or she chooses to play. (Later in the section we will discuss this question and also consider a 
more complicated situation in which the players can exercise some control over their moves by varying the sectors of 
their wheels.) 


Two-Person Zero-Sum Matrix Games 


The game described above is an example of a two-person zero-sum matrix game. The term zero-sum means that in each 
play of the game, the positive gain of one player is equal to the negative gain (loss) of the other player. That is, the sum 
of the two gains is zero. The term matrix game is used to describe a two-person game in which each player has only a 
finite number of moves, so that all possible outcomes of each play, and the corresponding gains of the players, can be 
displayed in tabular or matrix form, as in Table 1. 


In a general game of this type, let player R have m possible moves and let player C have n possible moves. In a play of 
the game, each player makes one of his or her possible moves, and then a payoff is made from player C to player R, 
depending on the moves. For i = 1, 2, ..., m, and j = 1, 2, ..., n, let us set 
$a_{ij}$ = payoff that player C makes to player R if player R 
makes move i and player C makes move j 


This payoff need not be money; it can be any type of commodity to which we can attach a numerical value. As before, if 
an entry $a_{ij}$ is negative, we mean that player C receives a payoff of $|a_{ij}|$ from player R. We arrange these mn possible 
payoffs in the form of an m × n matrix 

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$


which we will call the payoff matrix of the game. 


Each player is to make his or her moves on a probabilistic basis. For example, for the game discussed in the 
introduction, the ratio of the area of a sector to the area of the wheel would be the probability that the player makes the 
move corresponding to that sector. Thus, from Figure 10.7.1, we see that player R would make move 2 with probability 
$\tfrac{1}{3}$ and player C would make move 2 with probability $\tfrac{1}{4}$. In the general case we make the following definitions: 


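$p_i$ = probability that player R makes move i (i = 1, 2, ..., m) 
$q_j$ = probability that player C makes move j (j = 1, 2, ..., n) 
It follows from these definitions that 
$$p_1 + p_2 + \cdots + p_m = 1$$
and 
$$q_1 + q_2 + \cdots + q_n = 1$$
With the probabilities $p_i$ and $q_j$ we form two vectors: 
$$p = \begin{bmatrix} p_1 & p_2 & \cdots & p_m \end{bmatrix} \quad \text{and} \quad q = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}$$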

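We call the row vector p the strategy of player R and the column vector q the strategy of player C. For example, from 
Figure 10.7.1 we have 

$$p = \begin{bmatrix} \tfrac{1}{6} & \tfrac{1}{3} & \tfrac{1}{2} \end{bmatrix} \quad \text{and} \quad q = \begin{bmatrix} \tfrac{1}{4} \\ \tfrac{1}{4} \\ \tfrac{1}{3} \\ \tfrac{1}{6} \end{bmatrix}$$

for the carnival game described earlier. 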


From the theory of probability, if the probability that player R makes move i is $p_i$, and independently the probability that 
player C makes move j is $q_j$, then $p_i q_j$ is the probability that for any one play of the game, player R makes move i and 
player C makes move j. The payoff to player R for such a pair of moves is $a_{ij}$. If we multiply each possible payoff by its 
corresponding probability and sum over all possible payoffs, we obtain the expression 

$$a_{11}p_1q_1 + a_{12}p_1q_2 + \cdots + a_{1n}p_1q_n + a_{21}p_2q_1 + \cdots + a_{mn}p_mq_n \qquad (1)$$


Equation 1 is a weighted average of the payoffs to player R; each payoff is weighted according to the probability of its 
occurrence. In the theory of probability, this weighted average is called the expected payoff to player R. It can be shown 
that if the game is played many times, the long-term average payoff per play to player R is given by this expression. We 
denote this expected payoff by E(p, q) to emphasize the fact that it depends on the strategies of the two players. From 
the definition of the payoff matrix A and the strategies p and q, it can be verified that we may express the expected 
payoff in matrix notation as 

$$E(p, q) = \begin{bmatrix} p_1 & p_2 & \cdots & p_m \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix} = pAq \qquad (2)$$

Because E(p, q) is the expected payoff to player R, it follows that -E(p, q) is the expected payoff to player C. 
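
To see that the weighted sum in Equation 1 and the matrix product in Equation 2 agree, it can help to evaluate both on a small case. The sketch below uses a hypothetical 2 × 3 payoff matrix and hypothetical strategies (none of these numbers come from the text), with exact fractions so that the two results can be compared without rounding. The function names are illustrative.

```python
from fractions import Fraction as F

def expected_payoff_sum(p, A, q):
    """Equation 1: sum of a_ij * p_i * q_j over all pairs of moves."""
    return sum(A[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))

def expected_payoff_matrix(p, A, q):
    """Equation 2: the product pAq, computed as (pA) times q."""
    pA = [sum(p[i] * A[i][j] for i in range(len(p))) for j in range(len(q))]
    return sum(pA[j] * q[j] for j in range(len(q)))

# Hypothetical game: R has 2 moves, C has 3 moves
A = [[2, -1, 0],
     [-3, 4, 1]]
p = [F(1, 4), F(3, 4)]           # strategy of player R (entries sum to 1)
q = [F(1, 2), F(1, 3), F(1, 6)]  # strategy of player C (entries sum to 1)

print(expected_payoff_sum(p, A, q))     # 1/6
print(expected_payoff_matrix(p, A, q))  # 1/6
```

The expected payoff to player C in this hypothetical game is the negative of the printed value.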


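EXAMPLE 1 Expected Payoff to Player R 


For the carnival game described earlier, we have 

$$E(p, q) = pAq = \begin{bmatrix} \tfrac{1}{6} & \tfrac{1}{3} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} 3 & 5 & -2 & -1 \\ -2 & 4 & -3 & -4 \\ 6 & -5 & 0 & 3 \end{bmatrix} \begin{bmatrix} \tfrac{1}{4} \\ \tfrac{1}{4} \\ \tfrac{1}{3} \\ \tfrac{1}{6} \end{bmatrix} = \frac{13}{72} = 0.1805\ldots$$

Thus, in the long run, player R can expect to receive an average of about 18 cents from player C in each 
play of the game. 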


So far we have been discussing the situation in which each player has a predetermined strategy. We will now consider 
the more difficult situation in which both players can change their strategies independently. For example, in the game 
described in the introduction, we would allow both players to alter the areas of the sectors of their wheels and thereby 
control the probabilities of their respective moves. This qualitatively changes the nature of the problem and puts us 
firmly in the field of true game theory. It is understood that neither player knows what strategy the other will choose. It 
is also assumed that each player will make the best possible choice of strategy and that the other player knows this. 
Thus, player R attempts to choose a strategy p such that E(p, q) is as large as possible for the best strategy q that player 
C can choose; and similarly, player C attempts to choose a strategy q such that E(p, q) is as small as possible for the 
best strategy p that player R can choose. To see that such choices are actually possible, we will need the following 
theorem, called the Fundamental Theorem of Two-Person Zero-Sum Games. (The general proof, which involves ideas 
from the theory of linear programming, will be omitted. However, below we will prove this theorem for what are called 
strictly determined games and 2 x 2 matrix games.) 


THEOREM 10.7.1 Fundamental Theorem of Zero-Sum Games 


There exist strategies $p^*$ and $q^*$ such that 

$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*) \qquad (3)$$

for all strategies p and q. 


The strategies $p^*$ and $q^*$ in this theorem are the best possible strategies for players R and C, respectively. To see why 
this is so, let $v = E(p^*, q^*)$. The left-hand inequality of Equation 3 then reads 
$$E(p^*, q) \ge v \quad \text{for all strategies } q$$

This means that if player R chooses the strategy $p^*$, then no matter what strategy q player C chooses, the expected 
payoff to player R will never be below v. Moreover, it is not possible for player R to achieve an expected payoff greater 
than v. To see why, suppose there is some strategy $p^{**}$ that player R can choose such that 

$$E(p^{**}, q) > v \quad \text{for all strategies } q$$

Then, in particular, 
$$E(p^{**}, q^*) > v$$
But this contradicts the right-hand inequality of Equation 3, which requires that $v \ge E(p^{**}, q^*)$. Consequently, the best 
player R can do is prevent his or her expected payoff from falling below the value v. Similarly, the best player C can do 
is ensure that player R's expected payoff does not exceed v, and this can be achieved by using strategy $q^*$. 


On the basis of this discussion, we arrive at the following definitions. 


DEFINITION 1 


If $p^*$ and $q^*$ are strategies such that 

$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*) \qquad (4)$$

for all strategies p and q, then 
(i) $p^*$ is called an optimal strategy for player R. 
(ii) $q^*$ is called an optimal strategy for player C. 
(iii) $v = E(p^*, q^*)$ is called the value of the game. 


The wording in this definition suggests that optimal strategies are not necessarily unique. This is indeed the case, and in 
Exercise 2 we ask you to show this. However, it can be proved that any two sets of optimal strategies always result in 
the same value v of the game. That is, if $p^*$, $q^*$ and $p^{**}$, $q^{**}$ are optimal strategies, then 

$$E(p^*, q^*) = E(p^{**}, q^{**}) \qquad (5)$$
The value of a game is thus the expected payoff to player R when both players choose any possible optimal strategies. 


To find optimal strategies, we must find vectors $p^*$ and $q^*$ that satisfy Equation 4. This is generally done by using linear 
programming techniques. Next, we discuss special cases for which optimal strategies may be found by more elementary 
techniques. 


We now introduce the following definition. 


DEFINITION 2 


An entry $a_{rs}$ in a payoff matrix A is called a saddle point if 
(i) $a_{rs}$ is the smallest entry in its row, and 
(ii) $a_{rs}$ is the largest entry in its column. 


A game whose payoff matrix has a saddle point is called strictly determined. 


For example, the shaded element in each of the following payoff matrices is a saddle point: 


з Ш 30 —50 —5 15 —8 —2 10 
xal 60 90 75|, : 
-4 —10 60 —30 7 10 6 9 
6 11 -3 2 


If a matrix has a saddle point $a_{rs}$, it turns out that the following strategies are optimal strategies for the two players: 

$$p^* = \begin{bmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \end{bmatrix} \ (\text{1 in the } r\text{th entry}), \qquad q^* = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \ (\text{1 in the } s\text{th entry})$$

That is, an optimal strategy for player R is to always make the rth move, and an optimal strategy for player C is to 
always make the sth move. Such strategies for which only one move is possible are called pure strategies. Strategies for 
which more than one move is possible are called mixed strategies. To show that the above pure strategies are optimal, 
you can verify the following three equations (see Exercise 6): 


$$E(p^*, q^*) = p^* A q^* = a_{rs} \qquad (6)$$
$$E(p^*, q) = p^* A q \ge a_{rs} \quad \text{for any strategy } q \qquad (7)$$
$$E(p, q^*) = p A q^* \le a_{rs} \quad \text{for any strategy } p \qquad (8)$$

Together, these three equations imply that 
$$E(p^*, q) \ge E(p^*, q^*) \ge E(p, q^*)$$

for all strategies p and q. Because this is exactly Equation 4, it follows that $p^*$ and $q^*$ are optimal strategies. 


From Equation 6 the value of a strictly determined game is simply the numerical value of a saddle point $a_{rs}$. It is 
possible for a payoff matrix to have several saddle points, but then the uniqueness of the value of a game guarantees that 
the numerical values of all saddle points are the same. 


EXAMPLE 2 Optimal Strategies to Maximize a Viewing Audience 


Two competing television networks, R and C, are scheduling one-hour programs in the same time period. 
Network R can schedule one of three possible programs, and network C can schedule one of four possible 
programs. Neither network knows which program the other will schedule. Both networks ask the same 
outside polling agency to give them an estimate of how all possible pairings of the programs will divide the 
viewing audience. The agency gives them each Table 2, whose (i, j)-th entry is the percentage of the 
viewing audience that will watch network R if network R's program i is paired against network C's program 
j. What program should each network schedule in order to maximize its viewing audience? 


Table 2 

                              Network C's Program 
                              1      2      3      4 
Network R's Program     1     60     20     30     55 
                        2     50     75     45     60 
                        3     70     45     35     30 

Solution Subtract 50 from each entry in Table 2 to construct the following matrix: 
$$\begin{bmatrix} 10 & -30 & -20 & 5 \\ 0 & 25 & -5 & 10 \\ 20 & -5 & -15 & -20 \end{bmatrix}$$

This is the payoff matrix of the two-person zero-sum game in which each network is considered to start 
with 50% of the audience, and the (i, j)-th entry of the matrix is the percentage of the viewing audience 
that network C loses to network R if programs i and j are paired against each other. It is easy to see that the 
entry 
$$a_{23} = -5$$
is a saddle point of the payoff matrix. Hence, the optimal strategy of network R is to schedule program 2, 
and the optimal strategy of network C is to schedule program 3. This will result in network R's receiving 
45% of the audience and network C's receiving 55% of the audience. 
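
The search for a saddle point in Definition 2 is a direct check of two conditions for each entry. The following is a minimal sketch in plain Python; the function name `saddle_points` is illustrative. It is applied to the payoff matrix of Example 2 and reports each saddle point as (row, column, value) with 1-based indices.

```python
def saddle_points(A):
    """Return (r, s, a_rs) for every entry that is smallest in its row and largest in its column."""
    points = []
    for r, row in enumerate(A):
        for s, entry in enumerate(row):
            column = [A[i][s] for i in range(len(A))]
            if entry == min(row) and entry == max(column):
                points.append((r + 1, s + 1, entry))
    return points

# Payoff matrix of Example 2 (Table 2 with 50 subtracted from each entry)
A = [[10, -30, -20, 5],
     [0, 25, -5, 10],
     [20, -5, -15, -20]]

print(saddle_points(A))  # [(2, 3, -5)]
```

An empty result means the game is not strictly determined, in which case pure strategies are not optimal and other techniques are needed.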


2 x 2 Matrix Games 


Another case in which the optimal strategies can be found by elementary means occurs when each player has only two 
possible moves. In this case, the payoff matrix is a 2 x 2 matrix 


$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$
If the game is strictly determined, at least one of the four entries of A is a saddle point, and the techniques discussed 
above can then be applied to determine optimal strategies for the two players. If the game is not strictly determined, we 
first compute the expected payoff for arbitrary strategies p and q: 

$$E(p, q) = pAq = \begin{bmatrix} p_1 & p_2 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = a_{11}p_1q_1 + a_{12}p_1q_2 + a_{21}p_2q_1 + a_{22}p_2q_2 \qquad (9)$$
Because 
$$p_1 + p_2 = 1 \quad \text{and} \quad q_1 + q_2 = 1 \qquad (10)$$

we may substitute $p_2 = 1 - p_1$ and $q_2 = 1 - q_1$ into 9 to obtain 

$$E(p, q) = a_{11}p_1q_1 + a_{12}p_1(1 - q_1) + a_{21}(1 - p_1)q_1 + a_{22}(1 - p_1)(1 - q_1) \qquad (11)$$

If we rearrange the terms in Equation 11, we can write 
$$E(p, q) = [(a_{11} + a_{22} - a_{12} - a_{21})p_1 - (a_{22} - a_{21})]q_1 + (a_{12} - a_{22})p_1 + a_{22} \qquad (12)$$


By examining the coefficient of the $q_1$ term in 12, we see that if we set 

$$p_1 = p_1^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \qquad (13)$$
then that coefficient is zero, and 12 reduces to 
$$E(p^*, q) = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \qquad (14)$$

Equation 14 is independent of q; that is, if player R chooses the strategy determined by 13, player C cannot change the 
expected payoff by varying his or her strategy. 


In a similar manner, it can be verified that if player C chooses the strategy determined by 

$$q_1 = q_1^* = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \qquad (15)$$
then substituting in 12 gives 
$$E(p, q^*) = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \qquad (16)$$
Equations 14 and 16 show that 
$$E(p^*, q) = E(p^*, q^*) = E(p, q^*) \qquad (17)$$

for all strategies p and q. Thus, the strategies determined by 13, 15, and 10 are optimal strategies for players R and C, 
respectively, and so we have the following result. 


THEOREM 10.7.2 Optimal Strategies for a 2 x 2 Matrix Game 


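For a 2 × 2 game that is not strictly determined, optimal strategies for players R and C are 
$$p^* = \begin{bmatrix} \dfrac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} & \dfrac{a_{11} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \end{bmatrix}$$
and 
$$q^* = \begin{bmatrix} \dfrac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} \\[2ex] \dfrac{a_{11} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} \end{bmatrix}$$
The value of the game is 

$$v = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}}$$


In order to be complete, we must show that the entries in the vectors $p^*$ and $q^*$ are numbers strictly between 0 and 1. In 
Exercise 8 we ask you to show that this is the case as long as the game is not strictly determined. 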


Equation 17 is interesting in that it implies that either player can force the expected payoff to be the value of the game 
by choosing his or her optimal strategy, regardless of which strategy the other player chooses. This is not true, in 
general, for games in which either player has more than two moves. 


EXAMPLE 3 Using Theorem 10.7.2 


The federal government desires to inoculate its citizens against a certain flu virus. The virus has two 
strains, and the proportions in which the two strains occur in the virus population are not known. Two 
vaccines have been developed and each citizen is given only one of them. Vaccine 1 is 85% effective 
against strain 1 and 70% effective against strain 2. Vaccine 2 is 60% effective against strain 1 and 90% 
effective against strain 2. What inoculation policy should the government adopt? 


Solution We can consider this a two-person game in which player R (the government) desires to make 
the payoff (the fraction of citizens resistant to the virus) as large as possible, and player C (the virus) 
desires to make the payoff as small as possible. The payoff matrix is 


                   Strain 
                   1      2 
Vaccine     1     .85    .70 
            2     .60    .90 

This matrix has no saddle points, so Theorem 10.7.2 is applicable. Consequently, 
$$p_1^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{.90 - .60}{.85 + .90 - .70 - .60} = \frac{.30}{.45} = \frac{2}{3}$$
$$p_2^* = 1 - p_1^* = 1 - \frac{2}{3} = \frac{1}{3}$$
$$q_1^* = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{.90 - .70}{.85 + .90 - .70 - .60} = \frac{.20}{.45} = \frac{4}{9}$$
$$q_2^* = 1 - q_1^* = 1 - \frac{4}{9} = \frac{5}{9}$$
$$v = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - a_{12} - a_{21}} = \frac{(.85)(.90) - (.70)(.60)}{.85 + .90 - .70 - .60} = \frac{.345}{.45} = .7666\ldots$$

Thus, the optimal strategy for the government is to inoculate $\tfrac{2}{3}$ of the citizens with vaccine 1 and $\tfrac{1}{3}$ of the 
citizens with vaccine 2. This will guarantee that about 76.7% of the citizens will be resistant to a virus 
attack regardless of the distribution of the two strains. 

In contrast, a virus distribution of $\tfrac{4}{9}$ of strain 1 and $\tfrac{5}{9}$ of strain 2 will result in the same 76.7% of resistant 
citizens, regardless of the inoculation strategy adopted by the government (see Exercise 7). 
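
The formulas of Theorem 10.7.2 can be checked numerically on the vaccine game. This is a minimal sketch using exact fractions; the function name `solve_2x2` is illustrative, and the function assumes the game is not strictly determined (so the common denominator is nonzero). It also confirms the observation following Equation 17: with $p^*$ fixed, the expected payoff is the same against either strain.

```python
from fractions import Fraction as F

def solve_2x2(A):
    """Optimal strategies and value for a 2 x 2 game that has no saddle point."""
    (a11, a12), (a21, a22) = A
    d = a11 + a22 - a12 - a21          # common denominator in Theorem 10.7.2
    p = ((a22 - a21) / d, (a11 - a12) / d)
    q = ((a22 - a12) / d, (a11 - a21) / d)
    v = (a11 * a22 - a12 * a21) / d
    return p, q, v

# Payoff matrix of Example 3: rows are vaccines, columns are strains
A = [[F(85, 100), F(70, 100)],
     [F(60, 100), F(90, 100)]]

p, q, v = solve_2x2(A)
print(p)  # (Fraction(2, 3), Fraction(1, 3))
print(q)  # (Fraction(4, 9), Fraction(5, 9))
print(v)  # 23/30

# Expected payoff of p* against each pure strategy of the virus (each column of A)
against_strain = [p[0] * A[0][j] + p[1] * A[1][j] for j in range(2)]
print(against_strain == [v, v])  # True
```

The exact value 23/30 is the decimal .7666... obtained in the example.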


Exercise Set 10.7 


1. Suppose that a game has a payoff matrix 


(a) If players R and C use strategies 


fle Ble Ble Blo 


respectively, what is the expected payoff of the game? 


(b) If player C keeps his strategy fixed as in part (a), what strategy should player R choose to maximize his expected 
payoff? 


(c) If player R keeps her strategy fixed as in part (a), what strategy should player C choose to minimize the expected 
payoff to player R? 


Answer: 


(a) -5/8 
(b) $[0 \ \ 1 \ \ 0]$ 
(c) $[1 \ \ 0 \ \ 0 \ \ 0]^T$ 


2. Construct a simple example to show that optimal strategies are not necessarily unique. For example, find a payoff 
matrix with several equal saddle points. 


Answer: 


1 
1 


3. For the strictly determined games with the following payoff matrices, find optimal strategies for the two players, and 
find the values of the games. 


Let A = l | | for example. 


(а) |5 2 
7 3 
(b |-5 =2 
2 
—4 1 
(Q| 2-2 0 
=É 0 —5 
5 2 3 
(d) | =3 2 -l 
—2 =] 5 
-4 1 0 
-3 4 6 
Answer: 


(a) $p^* = [0 \ \ 1]$, $q^* = [0 \ \ 1]^T$, $v = 3$ 


A 


л 


4. For the 2 × 2 games with the following payoff matrices, find optimal strategies for the two players, and find the 
values of the games. 


(a) $\begin{bmatrix} 6 & 3 \\ -1 & 4 \end{bmatrix}$ 

(b) $\begin{bmatrix} 40 & 20 \\ -10 & 30 \end{bmatrix}$ 

(a) 


ajo ылу bs mith ale oj cal 


ale 
scs ЛО ре иЗ — _29 
Р =[5 5| Vp t — 13 
13 


5. Player R has two playing cards: a black ace and a red four. Player C also has two cards: a black two and a red three. 
Each player secretly selects one of his or her cards. If both selected cards are the same color, player C pays player R 
the sum of the face values in dollars. If the cards are different colors, player R pays player C the sum of the face 
values. What are optimal strategies for both players, and what is the value of the game? 


Answer: 


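$p^* = \begin{bmatrix} \tfrac{13}{20} & \tfrac{7}{20} \end{bmatrix}$, $q^* = \begin{bmatrix} \tfrac{11}{20} \\ \tfrac{9}{20} \end{bmatrix}$, $v = -\tfrac{3}{20}$ 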
6. Verify Equations 6, 7, and 8. 
7. Verify the statement in the last paragraph of Example 3. 
8. Show that the entries of the optimal strategies $p^*$ and $q^*$ given in Theorem 10.7.2 are numbers strictly between zero 
and one. 
Section 10.7 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific 
calculator with some linear algebra capabilities. For each exercise you will need to read the relevant documentation for 
the particular utility you are using. The goal of these exercises is to provide you with a basic proficiency with your 
technology utility. Once you have mastered the techniques in these exercises, you will be able to use your technology 
utility to solve many of the problems in the regular exercise sets. 


T1. Consider a game between two players where each player can make up to n different moves (n > 1). If the ith move
of player R and the jth move of player C are such that i + j is even, then C pays R $1. If i + j is odd, then R pays C $1.
Assume that both players have the same strategy; that is, \(p_n = [p_i]_{1 \times n}\) and \(q_n = [p_i]_{n \times 1}\), where
\(p_1 + p_2 + p_3 + \cdots + p_n = 1\). Use a computer to show that

\[
\begin{aligned}
E(p_2, q_2) &= (p_1 - p_2)^2 \\
E(p_3, q_3) &= (p_1 - p_2 + p_3)^2 \\
E(p_4, q_4) &= (p_1 - p_2 + p_3 - p_4)^2 \\
E(p_5, q_5) &= (p_1 - p_2 + p_3 - p_4 + p_5)^2
\end{aligned}
\]

Using these results as a guide, prove in general that the expected payoff to player R is

\[
E(p_n, q_n) = \left( \sum_{j=1}^{n} (-1)^{j+1} p_j \right)^2 \ge 0
\]

which shows that in the long run, player R will not lose in this game.

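T2. Consider a game between two players where each player can make up to n different moves (n > 1). If both players
make the same move, then player C pays player R $(n − 1). However, if both players make different moves, then
player R pays player C $1. Assume that both players have the same strategy; that is, \(p_n = [p_i]_{1 \times n}\) and \(q_n = [p_i]_{n \times 1}\),
where \(p_1 + p_2 + p_3 + \cdots + p_n = 1\). Use a computer to show that

\[
\begin{aligned}
E(p_2, q_2) ={}& \tfrac12 (p_1 - p_1)^2 + \tfrac12 (p_1 - p_2)^2 + \tfrac12 (p_2 - p_1)^2 + \tfrac12 (p_2 - p_2)^2 \\[4pt]
E(p_3, q_3) ={}& \tfrac12 (p_1 - p_1)^2 + \tfrac12 (p_1 - p_2)^2 + \tfrac12 (p_1 - p_3)^2 \\
 &+ \tfrac12 (p_2 - p_1)^2 + \tfrac12 (p_2 - p_2)^2 + \tfrac12 (p_2 - p_3)^2 \\
 &+ \tfrac12 (p_3 - p_1)^2 + \tfrac12 (p_3 - p_2)^2 + \tfrac12 (p_3 - p_3)^2 \\[4pt]
E(p_4, q_4) ={}& \tfrac12 (p_1 - p_1)^2 + \tfrac12 (p_1 - p_2)^2 + \tfrac12 (p_1 - p_3)^2 + \tfrac12 (p_1 - p_4)^2 \\
 &+ \tfrac12 (p_2 - p_1)^2 + \tfrac12 (p_2 - p_2)^2 + \tfrac12 (p_2 - p_3)^2 + \tfrac12 (p_2 - p_4)^2 \\
 &+ \tfrac12 (p_3 - p_1)^2 + \tfrac12 (p_3 - p_2)^2 + \tfrac12 (p_3 - p_3)^2 + \tfrac12 (p_3 - p_4)^2 \\
 &+ \tfrac12 (p_4 - p_1)^2 + \tfrac12 (p_4 - p_2)^2 + \tfrac12 (p_4 - p_3)^2 + \tfrac12 (p_4 - p_4)^2
\end{aligned}
\]

Using these results as a guide, prove in general that the expected payoff to player R is

\[
E(p_n, q_n) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (p_i - p_j)^2 \ge 0
\]

which shows that in the long run, player R will not lose in this game.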


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


10.8 Leontief Economic Models 


In this section we discuss two linear models for economic systems. Some results about nonnegative matrices are applied to determine 
equilibrium price structures and outputs necessary to satisfy demand. 


Prerequisites 


Linear Systems 
Matrices 


Economic Systems 


Matrix theory has been very successful in describing the interrelations among prices, outputs, and demands in economic systems. In 
this section we discuss some simple models based on the ideas of Nobel laureate Wassily Leontief. We examine two different but 
related models: the closed or input-output model, and the open or production model. In each, we are given certain economic 
parameters that describe the interrelations between the “industries” in the economy under consideration. Using matrix theory, we then 
evaluate certain other parameters, such as prices or output levels, in order to satisfy a desired economic objective. We begin with the 
closed model. 


Leontief Closed (Input-Output) Model 
First we present a simple example; then we proceed to the general theory of the model. 
EXAMPLE 1 An Input-Output Model


Three homeowners—a carpenter, an electrician, and a plumber—agree to make repairs in their three homes. They agree 
to work a total of 10 days each according to the following schedule: 


                                          Work Performed by
                                       Carpenter   Electrician   Plumber
Days of Work in Home of Carpenter          2            1           6
Days of Work in Home of Electrician        4            5           1
Days of Work in Home of Plumber            4            4           3

For tax purposes, they must report and pay each other a reasonable daily wage, even for the work each does on his or her
own home. Their normal daily wages are about $100, but they agree to adjust their respective daily wages so that each
homeowner will come out even; that is, so that the total amount paid out by each is the same as the total amount each
receives. We can set


\(p_1\) = daily wage of carpenter
\(p_2\) = daily wage of electrician
\(p_3\) = daily wage of plumber


To satisfy the “equilibrium” condition that each homeowner comes out even, we require that 
total expenditures = total income 


for each of the homeowners for the 10-day period. For example, the carpenter pays a total of \(2p_1 + p_2 + 6p_3\) for the
repairs in his own home and receives a total income of \(10p_1\) for the repairs that he performs on all three homes.
Equating these two expressions then gives the first of the following three equations:

\[
\begin{aligned}
2p_1 + p_2 + 6p_3 &= 10p_1 \\
4p_1 + 5p_2 + p_3 &= 10p_2 \\
4p_1 + 4p_2 + 3p_3 &= 10p_3
\end{aligned}
\]


The remaining two equations are the equilibrium equations for the electrician and the plumber. Dividing these equations 
by 10 and rewriting them in matrix form yields 


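\[
\begin{bmatrix} .2 & .1 & .6 \\ .4 & .5 & .1 \\ .4 & .4 & .3 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
=
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
\tag{1}
\]
Equation 1 can be rewritten as a homogeneous system by subtracting the left side from the right side to obtain
\[
\begin{bmatrix} .8 & -.1 & -.6 \\ -.4 & .5 & -.1 \\ -.4 & -.4 & .7 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]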


The solution of this homogeneous system is found to be (verify) 


\[
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} = s \begin{bmatrix} 31 \\ 32 \\ 36 \end{bmatrix}
\]

where s is an arbitrary constant. This constant is a scale factor, which the homeowners may choose for their
convenience. For example, they can set s = 3 so that the corresponding daily wages ($93, $96, and $108) are about
$100.
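
The equilibrium condition of Example 1 can be checked numerically. The following is a minimal sketch in Python, assuming exact rational arithmetic from the standard `fractions` module; the names `E`, `mat_vec`, `income`, and `spending` are illustrative choices and do not come from the text.

```python
from fractions import Fraction as F

# Exchange matrix of Equation 1: days worked in each home, divided by 10.
E = [[F(2, 10), F(1, 10), F(6, 10)],
     [F(4, 10), F(5, 10), F(1, 10)],
     [F(4, 10), F(4, 10), F(3, 10)]]

def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

s = 3                                  # scale factor chosen in the example
p = [s * w for w in (31, 32, 36)]      # daily wages of carpenter, electrician, plumber

income = [10 * w for w in p]                 # each person works 10 days in total
spending = [10 * e for e in mat_vec(E, p)]   # what each homeowner pays out

print(p)
print(income == spending)   # equilibrium: expenditures equal income
```

Because the arithmetic is exact, the comparison tests Ep = p itself rather than an approximation of it.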


This example illustrates the salient features of the Leontief input-output model of a closed economy. In the basic Equation 1, each 
column sum of the coefficient matrix is 1, corresponding to the fact that each of the homeowners' “output” of labor is completely 
distributed among these same homeowners in the proportions given by the entries in the column. Our problem is to determine suitable 
“prices” for these outputs so as to put the system in equilibrium—that is, so that each homeowner's total expenditures equal his or her 
total income. 


In the general model we have an economic system consisting of a finite number of “industries,” which we number as industries
1, 2, ..., k. Over some fixed period of time, each industry produces an “output” of some good or service that is completely utilized in a
predetermined manner by the k industries. An important problem is to find suitable “prices” to be charged for these k outputs so that
for each industry, total expenditures equal total income. Such a price structure represents an equilibrium position for the economy.


For the fixed time period in question, let us set 
\(p_i\) = price charged by the ith industry for its total output
\(e_{ij}\) = fraction of the total output of the jth industry purchased by the ith industry


for i, j = 1, 2, ..., k. By definition, we have


(i) \(p_i \ge 0\), i = 1, 2, ..., k
(ii) \(e_{ij} \ge 0\), i, j = 1, 2, ..., k
(iii) \(e_{1j} + e_{2j} + \cdots + e_{kj} = 1\), j = 1, 2, ..., k
With these quantities, we form the price vector 
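\[
p = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_k \end{bmatrix}
\]

and the exchange matrix or input-output matrix

\[
E = \begin{bmatrix}
e_{11} & e_{12} & \cdots & e_{1k} \\
e_{21} & e_{22} & \cdots & e_{2k} \\
\vdots & \vdots & & \vdots \\
e_{k1} & e_{k2} & \cdots & e_{kk}
\end{bmatrix}
\]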


Condition (iii) expresses the fact that all the column sums of the exchange matrix are 1. 


As in the example, in order that the expenditures of each industry be equal to its income, the following matrix equation must be
satisfied [see Equation 1]:

\[
Ep = p \tag{2}
\]
or
\[
(I - E)p = 0 \tag{3}
\]
Equation 3 is a homogeneous linear system for the price vector p. It will have a nontrivial solution if and only if the determinant of its
coefficient matrix I − E is zero. In Exercise 7 we ask you to show that this is the case for any exchange matrix E. Thus, Equation 3 always has
nontrivial solutions for the price vector p.
Actually, for our economic model to make sense, we need more than just the fact that Equation 3 has nontrivial solutions for p. We also need the
prices \(p_i\) of the k outputs to be nonnegative numbers. We express this condition as \(p \ge 0\). (In general, if A is any vector or matrix, the
notation \(A \ge 0\) means that every entry of A is nonnegative, and the notation \(A > 0\) means that every entry of A is positive. Similarly,
\(A \ge B\) means \(A - B \ge 0\), and \(A > B\) means \(A - B > 0\).) To show that Equation 3 has a nontrivial solution for which \(p \ge 0\) is a bit more difficult
than showing merely that some nontrivial solution exists. But it is true, and we state this fact without proof in the following theorem.


THEOREM 10.8.1 


If E is an exchange matrix, then Ep = p always has a nontrivial solution p whose entries are nonnegative.


Let us consider a few simple examples of this theorem. 
EXAMPLE 2 Using Theorem 10.8.1


Let 


Then (I − E)p = 0 is


which has the general solution 


where s is an arbitrary constant. We then have nontrivial solutions \(p \ge 0\) for any s > 0.


EXAMPLE 3 Using Theorem 10.8.1


Let 


Then (I − E)p = 0 has the general solution


41-4 


where s and t are independent arbitrary constants. Nontrivial solutions \(p \ge 0\) then result from any \(s \ge 0\) and \(t \ge 0\), not
both zero.


Example 2 indicates that in some situations one of the prices must be zero in order to satisfy the equilibrium condition. Example 3 
indicates that there may be several linearly independent price structures available. Neither of these situations describes a truly 
interdependent economic structure. The following theorem gives sufficient conditions for both cases to be excluded. 


THEOREM 10.8.2 


Let E be an exchange matrix such that for some positive integer m all the entries of \(E^m\) are positive. Then there is exactly one
linearly independent solution of (I − E)p = 0, and it may be chosen so that all its entries are positive.


We will not give a proof of this theorem. If you have read Section 10.5 on Markov chains, observe that this theorem is essentially the 
same as Theorem 10.5.4. What we are calling exchange matrices in this section were called stochastic or Markov matrices in Section 
10.5. 
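
To see what the positivity hypothesis of Theorem 10.8.2 rules out, it may help to work through two small exchange matrices. The matrices A and B below are illustrative assumptions chosen for this sketch. Each has column sums equal to 1, yet every power of each contains a zero entry, so the theorem does not apply to either.

```latex
% Illustrative exchange matrices (assumed for this sketch).
\[
A = \begin{bmatrix} \tfrac12 & 0 \\ \tfrac12 & 1 \end{bmatrix},
\qquad
(I - A)p = \begin{bmatrix} \tfrac12 & 0 \\ -\tfrac12 & 0 \end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\;\Longrightarrow\;
p = s \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\quad \text{(one price is forced to be zero)}
\]
\[
B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\qquad
(I - B)p = 0 \text{ for every } p
\;\Longrightarrow\;
p = s \begin{bmatrix} 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\quad \text{(two linearly independent price vectors)}
\]
\[
A^m = \begin{bmatrix} (\tfrac12)^m & 0 \\ 1 - (\tfrac12)^m & 1 \end{bmatrix},
\qquad
B^m = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\qquad (m = 1, 2, \ldots)
\]
```

In both cases the zero entries persist for every m, which is consistent with the failure of uniqueness or positivity of the price vector.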


EXAMPLE 4 Using Theorem 10.8.2


The exchange matrix in Example 1 was

\[
E = \begin{bmatrix} .2 & .1 & .6 \\ .4 & .5 & .1 \\ .4 & .4 & .3 \end{bmatrix}
\]

Because \(E > 0\), the condition \(E^m > 0\) in Theorem 10.8.2 is satisfied for m = 1. Consequently, we are guaranteed that
there is exactly one linearly independent solution of (I − E)p = 0, and it can be chosen so that \(p > 0\). In that example,
we found that
\[
p = \begin{bmatrix} 31 \\ 32 \\ 36 \end{bmatrix}
\]


is such a solution. 


Leontief Open (Production) Model 


In contrast with the closed model, in which the outputs of k industries are distributed only among themselves, the open model attempts 
to satisfy an outside demand for the outputs. Portions of these outputs can still be distributed among the industries themselves, to keep 
them operating, but there is to be some excess, some net production, with which to satisfy the outside demand. In the closed model the 
outputs of the industries are fixed, and our objective is to determine prices for these outputs so that the equilibrium condition, that 
expenditures equal incomes, is satisfied. In the open model it is the prices that are fixed, and our objective is to determine levels of the 
outputs of the industries needed to satisfy the outside demand. We will measure the levels of the outputs in terms of their economic 
values using the fixed prices. To be precise, over some fixed period of time, let 


\(x_i\) = monetary value of the total output of the ith industry
\(d_i\) = monetary value of the output of the ith industry needed to satisfy the outside demand
\(c_{ij}\) = monetary value of the output of the ith industry needed by the jth industry to produce one unit of monetary value of its own output


With these quantities, we define the production vector 


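\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}
\]
the demand vector
\[
d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_k \end{bmatrix}
\]
and the consumption matrix
\[
C = \begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1k} \\
c_{21} & c_{22} & \cdots & c_{2k} \\
\vdots & \vdots & & \vdots \\
c_{k1} & c_{k2} & \cdots & c_{kk}
\end{bmatrix}
\]
By their nature, we have that
\[
x \ge 0, \qquad d \ge 0, \qquad \text{and} \qquad C \ge 0
\]
From the definition of \(c_{ij}\) and \(x_j\), it can be seen that the quantity
\[
c_{i1}x_1 + c_{i2}x_2 + \cdots + c_{ik}x_k
\]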


is the value of the output of the ith industry needed by all k industries to produce a total output specified by the production vector x.
Because this quantity is simply the ith entry of the column vector Cx, we can say further that the ith entry of the column vector
\[
x - Cx
\]
is the value of the excess output of the ith industry available to satisfy the outside demand. The value of the outside demand for the
output of the ith industry is the ith entry of the demand vector d. Consequently, we are led to the following equation
\[
x - Cx = d
\]
or
\[
(I - C)x = d \tag{4}
\]
for the demand to be exactly met, without any surpluses or shortages. Thus, given C and d, our objective is to find a production vector
\(x \ge 0\) that satisfies Equation 4.


EXAMPLE 5 Production Vector for a Town


A town has three main industries: a coal-mining operation, an electric power-generating plant, and a local railroad. To 
mine $1 of coal, the mining operation must purchase $.25 of electricity to run its equipment and $.25 of transportation 
for its shipping needs. To produce $1 of electricity, the generating plant requires $.65 of coal for fuel, $.05 of its own 
electricity to run auxiliary equipment, and $.05 of transportation. To provide $1 of transportation, the railroad requires 
$.55 of coal for fuel and $.10 of electricity for its auxiliary equipment. In a certain week the coal-mining operation 
receives orders for $50,000 of coal from outside the town, and the generating plant receives orders for $25,000 of 
electricity from outside. There is no outside demand for the local railroad. How much must each of the three industries 
produce in that week to exactly satisfy their own demand and the outside demand? 


Solution For the one-week period let 
\(x_1\) = value of total output of coal-mining operation
\(x_2\) = value of total output of power-generating plant
\(x_3\) = value of total output of local railroad


From the information supplied, the consumption matrix of the system is 


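\[
C = \begin{bmatrix} 0 & .65 & .55 \\ .25 & .05 & .10 \\ .25 & .05 & 0 \end{bmatrix}
\]

The linear system (I − C)x = d is then
\[
\begin{bmatrix} 1.00 & -.65 & -.55 \\ -.25 & .95 & -.10 \\ -.25 & -.05 & 1.00 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 50{,}000 \\ 25{,}000 \\ 0 \end{bmatrix}
\]
The coefficient matrix on the left is invertible, and the solution is given by
\[
x = (I - C)^{-1} d = \frac{1}{503}
\begin{bmatrix} 756 & 542 & 470 \\ 220 & 690 & 190 \\ 200 & 170 & 630 \end{bmatrix}
\begin{bmatrix} 50{,}000 \\ 25{,}000 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 102{,}087 \\ 56{,}163 \\ 28{,}330 \end{bmatrix}
\]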


Thus, the total output of the coal-mining operation should be $102,087, the total output of the power-generating plant 
should be $56,163, and the total output of the railroad should be $28,330. 
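
The production vector of Example 5 can be reproduced by solving (I − C)x = d directly. The sketch below uses a small Gauss-Jordan routine with exact fractions. The helper name `solve` and the variable names are assumptions made for illustration, and any linear algebra package would serve equally well.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]       # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Consumption matrix and outside demand from Example 5.
C = [[F(0), F(65, 100), F(55, 100)],
     [F(25, 100), F(5, 100), F(10, 100)],
     [F(25, 100), F(5, 100), F(0)]]
d = [F(50000), F(25000), F(0)]

I_minus_C = [[(1 if i == j else 0) - C[i][j] for j in range(3)] for i in range(3)]
x = solve(I_minus_C, d)
print([round(float(v)) for v in x])   # outputs rounded to the nearest dollar
```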


Let us reconsider Equation 4: 
\[
(I - C)x = d
\]
If the square matrix I − C is invertible, we can write
\[
x = (I - C)^{-1} d \tag{5}
\]
In addition, if the matrix \((I - C)^{-1}\) has only nonnegative entries, then we are guaranteed that for any \(d \ge 0\), Equation 5 has a unique
nonnegative solution for x. This is a particularly desirable situation, as it means that any outside demand can be met. The terminology
used to describe this case is given in the following definition.


DEFINITION 1 


A consumption matrix C is said to be productive if \((I - C)^{-1}\) exists and

\[
(I - C)^{-1} \ge 0
\]


We will now consider some simple criteria that guarantee that a consumption matrix is productive. The first is given in the following 
theorem. 


THEOREM 10.8.3 Productive Consumption Matrix 


A consumption matrix C is productive if and only if there is some production vector \(x \ge 0\) such that \(x > Cx\).


(The proof is outlined in Exercise 9.) The condition \(x > Cx\) means that there is some production schedule possible such that each
industry produces more than it consumes.


Theorem 10.8.3 has two interesting corollaries. Suppose that all the row sums of C are less than 1. If 


\[
x = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}
\]

then Cx is a column vector whose entries are these row sums. Therefore, \(x > Cx\), and the condition of Theorem 10.8.3 is satisfied.
Thus, we arrive at the following corollary: 


COROLLARY 10.8.4 


A consumption matrix is productive if each of its row sums is less than 1. 


As we ask you to show in Exercise 8, this corollary leads to the following: 


COROLLARY 10.8.5 


A consumption matrix is productive if each of its column sums is less than 1. 


Recalling the definition of the entries of the consumption matrix C, we see that the jth column sum of C is the total value of the outputs 
of all k industries needed to produce one unit of value of output of the jth industry. The jth industry is thus said to be profitable if that 
jth column sum is less than 1. In other words, Corollary 10.8.5 says that a consumption matrix is productive if all k industries in the 
economic system are profitable. 


EXAMPLE 6 Using Corollary 10.8.5


The consumption matrix in Example 5 was

\[
C = \begin{bmatrix} 0 & .65 & .55 \\ .25 & .05 & .10 \\ .25 & .05 & 0 \end{bmatrix}
\]

All three column sums in this matrix are less than 1, so all three industries are profitable. Consequently, by Corollary
10.8.5, the consumption matrix C is productive. This can also be seen in the calculations in Example 5, as \((I - C)^{-1}\) is
nonnegative.
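
The row-sum and column-sum tests of Corollaries 10.8.4 and 10.8.5 are easy to automate. The following sketch uses hypothetical function names, and it implements only these two sufficient conditions. A matrix that fails both may still be productive by Theorem 10.8.3.

```python
def row_sums(C):
    """Row sums of a matrix given as a list of rows."""
    return [sum(row) for row in C]

def column_sums(C):
    """Column sums of a matrix given as a list of rows."""
    return [sum(col) for col in zip(*C)]

def productive_by_corollaries(C):
    """Sufficient test only: True if Corollary 10.8.4 or 10.8.5 applies."""
    return (all(s < 1 for s in row_sums(C))
            or all(s < 1 for s in column_sums(C)))

# Consumption matrix of Example 5.
C = [[0.00, 0.65, 0.55],
     [0.25, 0.05, 0.10],
     [0.25, 0.05, 0.00]]

print(column_sums(C))                 # every column sum is below 1
print(row_sums(C))                    # the first row sum exceeds 1
print(productive_by_corollaries(C))
```

For this matrix the row-sum test fails while the column-sum test succeeds, which is why Example 6 appeals to Corollary 10.8.5 rather than Corollary 10.8.4.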


Exercise Set 10.8 


1. For the following exchange matrices, find nonnegative price vectors that satisfy the equilibrium condition 3. 


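(a) \(\begin{bmatrix} \tfrac12 & \tfrac13 \\ \tfrac12 & \tfrac23 \end{bmatrix}\)

(b) \(\begin{bmatrix} \tfrac12 & 0 & \tfrac12 \\ \tfrac13 & 0 & \tfrac12 \\ \tfrac16 & 1 & 0 \end{bmatrix}\)

(c) \(\begin{bmatrix} .35 & .50 & .30 \\ .25 & .20 & .30 \\ .40 & .30 & .40 \end{bmatrix}\)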


Answer: 


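(a) \(\begin{bmatrix} 2 \\ 3 \end{bmatrix}\)

(b) \(\begin{bmatrix} 6 \\ 5 \\ 6 \end{bmatrix}\)

(c) \(\begin{bmatrix} 78 \\ 54 \\ 79 \end{bmatrix}\)

(Any positive scalar multiple of these vectors also satisfies the equilibrium condition.)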


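2. Using Theorem 10.8.3 and its corollaries, show that each of the following consumption matrices is productive.


(a) \(\begin{bmatrix} .8 & .1 \\ .3 & .6 \end{bmatrix}\)

(b) \(\begin{bmatrix} .70 & .30 & .25 \\ .20 & .40 & .25 \\ .05 & .15 & .25 \end{bmatrix}\)

(c) \(\begin{bmatrix} .7 & .3 & .2 \\ .1 & .4 & .3 \\ .2 & .4 & .1 \end{bmatrix}\)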
Answer: 


(a) Use Corollary 10.8.4; all row sums are less than one. 

(b) Use Corollary 10.8.5; all column sums are less than one. 

(c) Use Theorem 10.8.3, with \(x = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} > Cx = \begin{bmatrix} 1.9 \\ .9 \\ .9 \end{bmatrix}\).


3. Using Theorem 10.8.2, show that there is only one linearly independent price vector for the closed economic system with exchange
matrix 


An № 
© Un L^ 


Answer: 


E^ has all positive entries. 


4. Three neighbors have backyard vegetable gardens. Neighbor A grows tomatoes, neighbor B grows corn, and neighbor C grows
lettuce. They agree to divide their crops among themselves as follows: A gets 1/2 of the tomatoes, 1/3 of the corn, and 1/4 of the
lettuce. B gets 1/3 of the tomatoes, 1/3 of the corn, and 1/4 of the lettuce. C gets 1/6 of the tomatoes, 1/3 of the corn, and 1/2 of the lettuce.


What prices should the neighbors assign to their respective crops if the equilibrium condition of a closed economy is to be satisfied, 
and if the lowest-priced crop is to have a price of $100? 


Answer: 


Price of tomatoes, $120.00; price of corn, $100.00; price of lettuce, $106.67 


5. Three engineers, a civil engineer (CE), an electrical engineer (EE), and a mechanical engineer (ME), each have a consulting firm.
The consulting they do is of a multidisciplinary nature, so they buy a portion of each other's services. For each $1 of consulting the
CE does, she buys $.10 of the EE's services and $.30 of the ME's services. For each $1 of consulting the EE does, she buys $.20 of 
the CE's services and $.40 of the ME's services. And for each $1 of consulting the ME does, she buys $.30 of the CE's services and 
$.40 of the EE's services. In a certain week the CE receives outside consulting orders of $500, the EE receives outside consulting 
orders of $700, and the ME receives outside consulting orders of $600. What dollar amount of consulting does each engineer 
perform in that week? 


Answer: 


$1256 for the CE, $1448 for the EE, $1556 for the ME 


6. (a) Suppose that the demand \(d_i\) for the output of the ith industry increases by one unit. Explain why the ith column of the matrix
\((I - C)^{-1}\) is the increase that must be made to the production vector x to satisfy this additional demand.


(b) Referring to Example 5, use the result in part (a) to determine the increase in the value of the output of the coal-mining 
operation needed to satisfy a demand of one additional unit in the value of the output of the power-generating plant. 


Answer: 
(b) \(\tfrac{542}{503}\)


7. Using the fact that the column sums of an exchange matrix E are all 1, show that the column sums of I − E are zero. From this,
show that I − E has zero determinant, and so (I − E)p = 0 has nontrivial solutions for p.


8. Show that Corollary 10.8.5 follows from Corollary 10.8.4. 
[Hint: Use the fact that \((A^T)^{-1} = (A^{-1})^T\) for any invertible matrix A.]


9. (Calculus required) Prove Theorem 10.8.3 as follows: 


(a) Prove the “only if” part of the theorem; that is, show that if C is a productive consumption matrix, then there is a vector \(x \ge 0\)
such that \(x > Cx\).


(b) Prove the “if” part of the theorem as follows:
Step 1 Show that if there is a vector \(x^* \ge 0\) such that \(Cx^* < x^*\), then \(x^* > 0\).
Step 2 Show that there is a number \(\lambda\) such that \(0 < \lambda < 1\) and \(Cx^* < \lambda x^*\).
Step 3 Show that \(C^n x^* < \lambda^n x^*\) for n = 1, 2, ....
Step 4 Show that \(C^n \to 0\) as \(n \to \infty\).
Step 5 By multiplying out, show that
\[
(I - C)(I + C + C^2 + \cdots + C^{n-1}) = I - C^n
\]
for n = 1, 2, ....
Step 6 By letting \(n \to \infty\) in Step 5, show that the matrix infinite sum
\[
S = I + C + C^2 + \cdots
\]
exists and that (I − C)S = I.
Step 7 Show that \(S \ge 0\) and that \(S = (I - C)^{-1}\).
Step 8 Show that C is a productive consumption matrix.


Section 10.8 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, 
Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra 
capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are using. The goal of 
these exercises is to provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 


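T1. Consider a sequence of exchange matrices \(\{E_2, E_3, E_4, E_5, \ldots, E_n\}\), where

\[
E_2 = \begin{bmatrix} 0 & \tfrac12 \\ 1 & \tfrac12 \end{bmatrix},
\qquad
E_3 = \begin{bmatrix} 0 & \tfrac12 & \tfrac13 \\ 1 & 0 & \tfrac13 \\ 0 & \tfrac12 & \tfrac13 \end{bmatrix},
\qquad
E_4 = \begin{bmatrix} 0 & \tfrac12 & \tfrac13 & \tfrac14 \\ 1 & 0 & \tfrac13 & \tfrac14 \\ 0 & \tfrac12 & 0 & \tfrac14 \\ 0 & 0 & \tfrac13 & \tfrac14 \end{bmatrix},
\qquad
E_5 = \begin{bmatrix} 0 & \tfrac12 & \tfrac13 & \tfrac14 & \tfrac15 \\ 1 & 0 & \tfrac13 & \tfrac14 & \tfrac15 \\ 0 & \tfrac12 & 0 & \tfrac14 & \tfrac15 \\ 0 & 0 & \tfrac13 & 0 & \tfrac15 \\ 0 & 0 & 0 & \tfrac14 & \tfrac15 \end{bmatrix}
\]

and so on. Use a computer to show that \(E_2^2 > 0_2\), \(E_3^3 > 0_3\), \(E_4^4 > 0_4\), \(E_5^5 > 0_5\), and make the conjecture that although \(E_n^n > 0_n\) is true,
\(E_n^k > 0_n\) is not true for k = 1, 2, 3, ..., n − 1. Next, use a computer to determine the vectors \(p_n\) such that \(E_n p_n = p_n\) (for n = 2, 3, 4,
5, 6), and then see if you can discover a pattern that would allow you to compute \(p_{n+1}\) easily from \(p_n\). Test your discovery by first
constructing \(p_8\) from
\[
p_7 = \begin{bmatrix} 2520 \\ 3360 \\ 1890 \\ 672 \\ 175 \\ 36 \\ 7 \end{bmatrix}
\]
and then checking to see whether \(E_8 p_8 = p_8\).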
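
As a possible starting point for Exercise T1, the sketch below builds \(E_n\) from the pattern visible in \(E_2\) through \(E_5\). Column j holds 1/j in every row i ≤ j + 1 except the diagonal, and the last column holds 1/n in every row. That rule is an inference from the displayed matrices, and the function names are hypothetical. The code checks the rule against the vector \(p_7\) given in the exercise.

```python
from fractions import Fraction as F

def exchange_matrix(n):
    """Build E_n under the assumed pattern: column j has 1/j in rows
    i <= j + 1 with i != j; the last column has 1/n in every row."""
    E = [[F(0)] * n for _ in range(n)]
    for j in range(1, n + 1):
        for i in range(1, n + 1):
            if j == n or (i != j and i <= j + 1):
                E[i - 1][j - 1] = F(1, j)
    return E

def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def first_positive_power(E, limit=50):
    """Smallest k <= limit with every entry of E^k positive, else None."""
    P = E
    for k in range(1, limit + 1):
        if all(entry > 0 for row in P for entry in row):
            return k
        P = mat_mul(P, E)
    return None

p7 = [2520, 3360, 1890, 672, 175, 36, 7]
print(mat_vec(exchange_matrix(7), p7) == p7)   # does E_7 p_7 = p_7 hold?
for n in range(2, 6):
    print(n, first_positive_power(exchange_matrix(n)))
```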
T2. Consider an open production model having n industries with n > 1. In order to produce $1 of its own output, the jth industry must
spend $(1/n) for the output of the ith industry (for all i ≠ j), but the jth industry (for all j = 1, 2, 3, ..., n) spends nothing for its own
output. Construct the consumption matrix \(C_n\), show that it is productive, and determine an expression for \((I_n - C_n)^{-1}\). In
determining an expression for \((I_n - C_n)^{-1}\), use a computer to study the cases when n = 2, 3, 4, and 5; then make a conjecture and
prove your conjecture to be true. [Hint: If \(F_n = [1]_{n \times n}\) (i.e., the n × n matrix with every entry equal to 1), first show that

\[
F_n^2 = nF_n
\]

and then express your value of \((I_n - C_n)^{-1}\) in terms of n, \(I_n\), and \(F_n\).]
10.9 Forest Management


In this section we discuss a matrix model for the management of a forest where trees are grouped into classes according to height.
The optimal sustainable yield of a periodic harvest is calculated when the trees of different height classes can have different
economic values.


Prerequisites 


Matrix Operations 


Optimal Sustainable Yield 


Our objective is to introduce a simplified model for the sustainable harvesting of a forest whose trees are classified by height. The 
height of a tree is assumed to determine its economic value when it is cut down and sold. Initially, there is a distribution of trees 
of various heights. The forest is then allowed to grow for a certain period of time, after which some of the trees of various heights 
are harvested. The trees left unharvested are to be of the same height configuration as the original forest, so that the harvest is 
sustainable. As we will see, there are many such sustainable harvesting procedures. We want to find one for which the total 
economic value of all the trees removed is as large as possible. This determines the optimal sustainable yield of the forest and is 
the largest yield that can be attained continually without depleting the forest. 


The Model 


Suppose that a harvester has a forest of Douglas fir trees that are to be sold as Christmas trees year after year. Every December 
the harvester cuts down some of the trees to be sold. For each tree cut down, a seedling is planted in its place. In this way the total 
number of trees in the forest is always the same. (In this simplified model, we will not take into account trees that die between 
harvests. We assume that every seedling planted survives and grows until it is harvested.) 


In the marketplace, trees of different heights have different economic values. Suppose that there are n different price classes
corresponding to certain height intervals, as shown in Table 1 and Figure 10.9.1. The first class consists of seedlings with heights
in the interval \([0, h_1)\), and these seedlings are of no economic value. The nth class consists of trees with heights greater than or
equal to \(h_{n-1}\).


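[Figure 10.9.1: value of tree plotted against height of tree]


Table 1

Class            Value (dollars)    Height Interval
1 (seedlings)    None               [0, h_1)
2                p_2                [h_1, h_2)
3                p_3                [h_2, h_3)
...              ...                ...
n - 1            p_{n-1}            [h_{n-2}, h_{n-1})
n                p_n                [h_{n-1}, ∞)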


Let \(x_i\) (i = 1, 2, ..., n) be the number of trees within the ith class that remain after each harvest. We form a column vector with
these numbers and call it the nonharvest vector:

\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]


For a sustainable harvesting policy, the forest is to be returned after each harvest to the fixed configuration given by the 
nonharvest vector x. Part of our problem is to find those nonharvest vectors x for which sustainable harvesting is possible. 


Because the total number of trees in the forest is fixed, we can set

\[
x_1 + x_2 + \cdots + x_n = s \tag{1}
\]
where s is predetermined by the amount of land available and the amount of space each tree requires. Referring to Figure 10.9.2,
we have the following situation. The forest configuration is given by the vector x after each harvest. Between harvests the trees
grow and produce a new forest configuration before each harvest. A certain number of trees are removed from each class at the
harvest. Finally, a seedling is planted in place of each tree removed, to return the forest again to the configuration x.


[Figure 10.9.2: forest before growth (nonharvest vector x); forest after growth; trees removed and trees not removed; forest after harvest (nonharvest vector x), the same forest configuration]


Consider first the growth of the forest between harvests. During this period a tree in the ith class may grow and move up to a
higher height class. Or its growth may be retarded for some reason, and it will remain in the same class. We consequently define
the following growth parameters \(g_i\) for i = 1, 2, ..., n − 1:


\(g_i\) = the fraction of trees in the ith class that grow into the (i + 1)st class during a growth period


For simplicity we assume that a tree can move at most one height class upward in one growth period. With this assumption, we
have


\(1 - g_i\) = the fraction of trees in the ith class that remain in the ith class during a growth period


With these n − 1 growth parameters, we form the following n × n growth matrix:

\[
G = \begin{bmatrix}
1 - g_1 & 0 & 0 & \cdots & 0 & 0 \\
g_1 & 1 - g_2 & 0 & \cdots & 0 & 0 \\
0 & g_2 & 1 - g_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 - g_{n-1} & 0 \\
0 & 0 & 0 & \cdots & g_{n-1} & 1
\end{bmatrix}
\tag{2}
\]
0 0 0 5 L—gy-1 0 
0 0 0 б En-1 1 
Because the entries of the vector x are the numbers of trees in the n classes before the growth period, you can verify that the
entries of the vector
\[
Gx = \begin{bmatrix}
(1 - g_1)x_1 \\
g_1x_1 + (1 - g_2)x_2 \\
g_2x_2 + (1 - g_3)x_3 \\
\vdots \\
g_{n-2}x_{n-2} + (1 - g_{n-1})x_{n-1} \\
g_{n-1}x_{n-1} + x_n
\end{bmatrix}
\tag{3}
\]
are the numbers of trees in the n classes after the growth period.
Suppose that during the harvest we remove \(y_i\) (i = 1, 2, ..., n) trees from the ith class. We will call the column vector
\[
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\]
the harvest vector. Thus, a total of
\[
y_1 + y_2 + \cdots + y_n
\]
trees are removed at each harvest. This is also the total number of trees added to the first class (the new seedlings) after each
harvest. If we define the following n × n replacement matrix
\[
R = \begin{bmatrix}
1 & 1 & \cdots & 1 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix}
\tag{4}
\]
then the column vector
\[
Ry = \begin{bmatrix} y_1 + y_2 + \cdots + y_n \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\tag{5}
\]
specifies the configuration of trees planted after each harvest.


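At this point we are ready to write the following equation, which characterizes a sustainable harvesting policy:


[configuration at end of growth period] − [harvest] + [new seedling replacement] = [configuration at beginning of growth period]

or mathematically,
\[
Gx - y + Ry = x
\]
This equation can be rewritten as
\[
(I - R)y = (G - I)x \tag{6}
\]
or more comprehensively as
\[
\begin{bmatrix}
0 & -1 & -1 & \cdots & -1 & -1 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_{n-1} \\ y_n \end{bmatrix}
=
\begin{bmatrix}
-g_1 & 0 & 0 & \cdots & 0 & 0 \\
g_1 & -g_2 & 0 & \cdots & 0 & 0 \\
0 & g_2 & -g_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -g_{n-1} & 0 \\
0 & 0 & 0 & \cdots & g_{n-1} & 0
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_{n-1} \\ x_n \end{bmatrix}
\]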


We will refer to Equation 6 as the sustainable harvesting condition. Any vectors x and y with nonnegative entries, and such that
\(x_1 + x_2 + \cdots + x_n = s\), which satisfy this matrix equation, determine a sustainable harvesting policy for the forest. Note that
if \(y_1 > 0\), then the harvester is removing seedlings of no economic value and replacing them with new seedlings. Because there is
no point in doing this, we assume that

\[
y_1 = 0 \tag{7}
\]


With this assumption, it can be verified that Equation 6 is the matrix form of the following set of equations:

\[
\begin{aligned}
y_2 + y_3 + \cdots + y_n &= g_1x_1 \\
y_2 &= g_1x_1 - g_2x_2 \\
y_3 &= g_2x_2 - g_3x_3 \\
&\;\;\vdots \\
y_{n-1} &= g_{n-2}x_{n-2} - g_{n-1}x_{n-1} \\
y_n &= g_{n-1}x_{n-1}
\end{aligned}
\tag{8}
\]
Note that the first equation in 8 is the sum of the remaining n − 1 equations.
Because we must have \(y_i \ge 0\) for i = 2, 3, ..., n, Equations 8 require that
\[
g_1x_1 \ge g_2x_2 \ge \cdots \ge g_{n-1}x_{n-1} \ge 0 \tag{9}
\]


Conversely, if x is a column vector with nonnegative entries that satisfy Equation 9, then Equations 7 and 8 define a column vector y with
nonnegative entries. Furthermore, x and y then satisfy the sustainable harvesting condition, Equation 6. In other words, a necessary and
sufficient condition for a nonnegative column vector x to determine a forest configuration that is capable of sustainable
harvesting is that its entries satisfy Equation 9.
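
The growth, harvest, and replacement cycle can be simulated directly. The sketch below is a minimal illustration. The three-class forest, its growth parameters, and all function names are hypothetical values chosen so that the arithmetic is easy to follow. It computes the harvest vector from Equations 7 and 8, tests Condition 9, and confirms that Gx − y + Ry returns the forest to x.

```python
def grow(x, g):
    """Apply the growth matrix G to x; g[i] is the fraction of class i+1
    (1-based) that moves up one class during a growth period."""
    n = len(x)
    out = [0.0] * n
    for i in range(n):
        stay = 1.0 if i == n - 1 else 1.0 - g[i]   # the top class never leaves
        out[i] += stay * x[i]
        if i < n - 1:
            out[i + 1] += g[i] * x[i]
    return out

def harvest_vector(x, g):
    """Equations 7 and 8: y_1 = 0 and y_i = g_{i-1} x_{i-1} - g_i x_i,
    with y_n = g_{n-1} x_{n-1}."""
    n = len(x)
    flow = [g[i] * x[i] for i in range(n - 1)]     # g_i x_i for i = 1..n-1
    return [0.0] + [flow[i - 1] - (flow[i] if i < n - 1 else 0.0)
                    for i in range(1, n)]

def is_sustainable(x, g):
    """Condition 9: g_1 x_1 >= g_2 x_2 >= ... >= g_{n-1} x_{n-1} >= 0."""
    flow = [g[i] * x[i] for i in range(len(x) - 1)] + [0.0]
    return all(v >= 0 for v in x) and all(a >= b for a, b in zip(flow, flow[1:]))

# Hypothetical forest with three height classes.
g = [0.5, 0.25]
x = [40.0, 60.0, 20.0]

y = harvest_vector(x, g)
after_growth = grow(x, g)
restored = [a - b for a, b in zip(after_growth, y)]   # Gx - y
restored[0] += sum(y)                                 # + Ry: seedlings replanted
print(is_sustainable(x, g), y, restored == x)
```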


Optimal Sustainable Yield 


Because we remove \(y_i\) trees from the ith class (i = 2, 3, ..., n) and each tree in the ith class has an economic value of \(p_i\), the
total yield of the harvest, Yld, is given by

\[
\mathit{Yld} = p_2y_2 + p_3y_3 + \cdots + p_ny_n \tag{10}
\]

Using Equations 8, we may substitute for the \(y_i\)'s in Equation 10 to obtain

\[
\mathit{Yld} = p_2g_1x_1 + (p_3 - p_2)g_2x_2 + \cdots + (p_n - p_{n-1})g_{n-1}x_{n-1} \tag{11}
\]
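
The passage from Equation 10 to Equation 11 is a regrouping of terms. As a short worked derivation, substituting the expressions for \(y_2, \ldots, y_n\) from Equations 8 and collecting the coefficient of each \(g_ix_i\) gives:

```latex
\begin{align*}
\mathit{Yld} &= p_2y_2 + p_3y_3 + \cdots + p_ny_n \\
 &= p_2(g_1x_1 - g_2x_2) + p_3(g_2x_2 - g_3x_3) + \cdots
    + p_{n-1}(g_{n-2}x_{n-2} - g_{n-1}x_{n-1}) + p_n g_{n-1}x_{n-1} \\
 &= p_2g_1x_1 + (p_3 - p_2)g_2x_2 + \cdots + (p_n - p_{n-1})g_{n-1}x_{n-1}
\end{align*}
```

Each product \(g_ix_i\) with \(2 \le i \le n-1\) appears once with coefficient \(-p_i\) and once with coefficient \(+p_{i+1}\), which produces the differences \(p_{i+1} - p_i\).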


Combining Equations 11, 1, and 9, we can now state the problem of maximizing the yield of the forest over all possible sustainable
harvesting policies as follows:


Problem


Find nonnegative numbers \(x_1, x_2, \ldots, x_n\) that maximize
\[
\mathit{Yld} = p_2g_1x_1 + (p_3 - p_2)g_2x_2 + \cdots + (p_n - p_{n-1})g_{n-1}x_{n-1}
\]
subject to
\[
x_1 + x_2 + \cdots + x_n = s
\]
and
\[
g_1x_1 \ge g_2x_2 \ge \cdots \ge g_{n-1}x_{n-1} \ge 0
\]


As formulated above, this problem belongs to the field of linear programming. However, we will illustrate the following result, 
without linear programming theory, by actually exhibiting a sustainable harvesting policy. 


THEOREM 10.9.1 Optimal Sustainable Yield 


The optimal sustainable yield is achieved by harvesting all the trees from one particular height class and none of the trees 
from any other height class. 


Let us first set
\(\mathit{Yld}_k\) = yield obtained by harvesting all of the kth class and none of the other classes


The largest value of \(\mathit{Yld}_k\) for k = 2, 3, ..., n will then be the optimal sustainable yield, and the corresponding value of k will be
the class that should be completely harvested to attain the optimal sustainable yield. Because no class but the kth is harvested, we
have

\[
y_2 = y_3 = \cdots = y_{k-1} = y_{k+1} = \cdots = y_n = 0 \tag{12}
\]
In addition, because all of the kth class is harvested, no trees are ever present in the height classes above the kth class. Thus,

\[
x_k = x_{k+1} = \cdots = x_n = 0 \tag{13}
\]


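Substituting Equations 12 and 13 into the sustainable harvesting condition, Equations 8, gives

\[
\begin{aligned}
y_k &= g_1x_1 \\
0 &= g_1x_1 - g_2x_2 \\
0 &= g_2x_2 - g_3x_3 \\
&\;\;\vdots \\
0 &= g_{k-2}x_{k-2} - g_{k-1}x_{k-1} \\
y_k &= g_{k-1}x_{k-1}
\end{aligned}
\tag{14}
\]

Equations 14 can also be written as

\[
y_k = g_1x_1 = g_2x_2 = \cdots = g_{k-1}x_{k-1} \tag{15}
\]

from which it follows that

\[
\begin{aligned}
x_2 &= g_1x_1/g_2 \\
x_3 &= g_1x_1/g_3 \\
&\;\;\vdots \\
x_{k-1} &= g_1x_1/g_{k-1}
\end{aligned}
\tag{16}
\]
If we substitute Equations 13 and 16 into
\[
x_1 + x_2 + \cdots + x_n = s
\]
[which is Equation 1], we can solve for \(x_1\) and obtain
\[
x_1 = \frac{s}{1 + \dfrac{g_1}{g_2} + \dfrac{g_1}{g_3} + \cdots + \dfrac{g_1}{g_{k-1}}} \tag{17}
\]

For the yield \(\mathit{Yld}_k\), we combine Equations 10, 12, 15, and 17 to obtain
\[
\begin{aligned}
\mathit{Yld}_k &= p_2y_2 + p_3y_3 + \cdots + p_ny_n \\
 &= p_ky_k \\
 &= p_kg_1x_1 \\
 &= \frac{p_k s}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}}
\end{aligned}
\tag{18}
\]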


Equation 18 determines $\mathrm{Yld}_k$ in terms of the known growth and economic parameters for any $k = 2, 3, \ldots, n$. Thus, the optimal sustainable yield is found as follows.


THEOREM 10.9.2 Finding the Optimal Sustainable Yield 


The optimal sustainable yield is the largest value of 
\[ \frac{p_k s}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}} \]
for $k = 2, 3, \ldots, n$. The corresponding value of k is the number of the class that is completely harvested.


In Exercise 4 we ask you to show that the nonharvest vector x for the optimal sustainable yield is 


\[
\mathbf{x} = \frac{s}{\dfrac{1}{g_1} + \dfrac{1}{g_2} + \cdots + \dfrac{1}{g_{k-1}}}
\begin{bmatrix} 1/g_1 \\ 1/g_2 \\ \vdots \\ 1/g_{k-1} \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\tag{19}
\]


Theorem 10.9.2 implies that it is not necessarily the highest-priced class of trees that should be totally cropped. The growth parameters $g_i$ must also be taken into account to determine the optimal sustainable yield.


EXAMPLE 1 Using Theorem 10.9.2


For a Scots pine forest in Scotland with a growth period of six years, the following growth matrix was found (see M. B. Usher, "A Matrix Approach to the Management of Renewable Resources, with Special Reference to Selection Forests," Journal of Applied Ecology, vol. 3, 1966, pp. 355-367):


\[
G = \begin{bmatrix}
.72 & 0 & 0 & 0 & 0 & 0 \\
.28 & .69 & 0 & 0 & 0 & 0 \\
0 & .31 & .75 & 0 & 0 & 0 \\
0 & 0 & .25 & .77 & 0 & 0 \\
0 & 0 & 0 & .23 & .63 & 0 \\
0 & 0 & 0 & 0 & .37 & 1.00
\end{bmatrix}
\]


Suppose that the prices of trees in the five tallest height classes are
\[ p_2 = \$50, \quad p_3 = \$100, \quad p_4 = \$150, \quad p_5 = \$200, \quad p_6 = \$250 \]
Which class should be completely harvested to obtain the optimal sustainable yield, and what is that yield?


Solution From matrix G we have that
\[ g_1 = .28, \quad g_2 = .31, \quad g_3 = .25, \quad g_4 = .23, \quad g_5 = .37 \]
Equation 18 then gives
\[
\begin{aligned}
\mathrm{Yld}_2 &= 50s / (.28^{-1}) = 14.0s \\
\mathrm{Yld}_3 &= 100s / (.28^{-1} + .31^{-1}) = 14.7s \\
\mathrm{Yld}_4 &= 150s / (.28^{-1} + .31^{-1} + .25^{-1}) = 13.9s \\
\mathrm{Yld}_5 &= 200s / (.28^{-1} + .31^{-1} + .25^{-1} + .23^{-1}) = 13.2s \\
\mathrm{Yld}_6 &= 250s / (.28^{-1} + .31^{-1} + .25^{-1} + .23^{-1} + .37^{-1}) = 14.0s
\end{aligned}
\]


We see that $\mathrm{Yld}_3$ is the largest of these five quantities, so from Theorem 10.9.2 the third class should be completely harvested every six years to maximize the sustainable yield. The corresponding optimal sustainable yield is 14.7s dollars, where s is the total number of trees in the forest.
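

The calculation in Example 1 is easy to reproduce numerically. The following Python sketch evaluates Equation 18 for each k and then builds the nonharvest vector of Equation 19 for the best class. The function names and the forest size s = 1000 are illustrative assumptions, not part of the text.

```python
# Sketch of Theorem 10.9.2 and Equation 19 for the data of Example 1.
# Function names and the sample forest size s are illustrative assumptions.

def sustainable_yields(g, p, s):
    """Return {k: Yld_k} for k = 2, ..., n (Equation 18).

    g -- growth parameters [g_1, ..., g_{n-1}]
    p -- prices [p_2, ..., p_n]
    s -- total number of trees in the forest
    """
    yields = {}
    inverse_sum = 0.0
    for k, (g_prev, p_k) in enumerate(zip(g, p), start=2):
        inverse_sum += 1.0 / g_prev      # running sum 1/g_1 + ... + 1/g_{k-1}
        yields[k] = p_k * s / inverse_sum
    return yields


def nonharvest_vector(g, k, s):
    """Return x from Equation 19 when class k is harvested completely."""
    inverses = [1.0 / g_i for g_i in g[:k - 1]]    # 1/g_1, ..., 1/g_{k-1}
    scale = s / sum(inverses)
    zeros = [0.0] * (len(g) + 1 - len(inverses))   # classes k, ..., n are empty
    return [scale * v for v in inverses] + zeros


g = [0.28, 0.31, 0.25, 0.23, 0.37]   # g_1, ..., g_5 read from G
p = [50, 100, 150, 200, 250]         # p_2, ..., p_6 in dollars
s = 1000                             # assumed forest size

yields = sustainable_yields(g, p, s)
best_k = max(yields, key=yields.get)
x = nonharvest_vector(g, best_k, s)

for k in sorted(yields):
    print(f"Yld_{k} = {yields[k] / s:.1f}s")
print("class to harvest:", best_k)
print("nonharvest vector:", [round(v, 1) for v in x])
```

With these inputs the largest yield occurs at k = 3, in agreement with the example, and the nonharvest vector has trees only in the first two classes.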


Exercise Set 10.9 


1. A certain forest is divided into three height classes and has a growth matrix between harvests given by 


\[
G = \begin{bmatrix}
\frac{1}{2} & 0 & 0 \\
\frac{1}{2} & \frac{1}{3} & 0 \\
0 & \frac{2}{3} & 1
\end{bmatrix}
\]


If the price of trees in the second class is $30 and the price of trees in the third class is $50, which class should be completely 
harvested to attain the optimal sustainable yield? What is the optimal yield if there are 1000 trees in the forest? 


Answer: 


The second class; $15,000 


2. In Example 1, to what level must the price of trees in the fifth class rise so that the fifth class is the one to harvest completely 
in order to attain the optimal sustainable yield? 


Answer: 


$223 
3. In Example 1, what must the ratio of the prices $p_2 : p_3 : p_4 : p_5 : p_6$ be in order that the yields $\mathrm{Yld}_k$, $k = 2, 3, 4, 5, 6$, all be the same? (In this case, any sustainable harvesting policy will produce the same optimal sustainable yield.)


Answer: 


1:1.90:3.02:4.24:5.00 
4. Derive Equation 19 for the nonharvest vector x corresponding to the optimal sustainable harvesting policy described in 
Theorem 10.9.2. 


5. For the optimal sustainable harvesting policy described in Theorem 10.9.2, how many trees are removed from the forest 
during each harvest? 


Answer: 


\[ s \big/ \left( g_1^{-1} + g_2^{-1} + \cdots + g_{k-1}^{-1} \right) \]


6. If all the growth parameters $g_1, g_2, \ldots, g_{n-1}$ in the growth matrix G are equal, what should the ratio of the prices $p_2 : p_3 : \cdots : p_n$ be in order that any sustainable harvesting policy be an optimal sustainable harvesting policy? (See Exercise 3.)


Answer:


$1 : 2 : 3 : \cdots : n - 1$


Section 10.9 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, 
Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some 
linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are 
using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have 
mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the 
regular exercise sets. 


T1. A particular forest has growth parameters given by
\[ g_i = \frac{1}{i} \]
for $i = 1, 2, 3, \ldots, n - 1$, where n (the total number of height classes) can be chosen as large as needed. Suppose that the value of a tree in the kth height interval is given by
\[ p_k = a(k - 1)^{\rho} \]
where a is a constant (in dollars) and $\rho$ is a parameter satisfying $1 \le \rho \le 2$.
(a) Show that the yield $\mathrm{Yld}_k$ is given by
\[ \mathrm{Yld}_k = \frac{2a(k - 1)^{\rho - 1} s}{k} \]
(b) For
\[ \rho = 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 \]
use a computer to determine the class number that should be completely harvested, and determine the optimal sustainable yield in each case. Make sure that you allow k to take on only integer values in your calculations.


(c) Repeat the calculations in part (b) using
\[ \rho = 1.91, 1.92, 1.93, 1.94, 1.95, 1.96, 1.97, 1.98, 1.99 \]


(d) Show that if $\rho = 2$, then the optimal sustainable yield can never be larger than 2as.
(e) Compare the values of k determined in parts (b) and (c) to $1/(2 - \rho)$, and use some calculus to explain why
\[ k \simeq \frac{1}{2 - \rho} \]

T2. A particular forest has growth parameters given by
\[ g_i = \frac{1}{2^i} \]
for $i = 1, 2, 3, \ldots, n - 1$, where n (the total number of height classes) can be chosen as large as needed. Suppose that the value of a tree in the kth height interval is given by
\[ p_k = a(k - 1)^{\rho} \]
where a is a constant (in dollars) and $\rho$ is a parameter satisfying $1 \le \rho$.
(a) Show that the yield $\mathrm{Yld}_k$ is given by
\[ \mathrm{Yld}_k = \frac{a(k - 1)^{\rho} s}{2^k - 2} \]
(b) For
\[ \rho = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \]
use a computer to determine the class number that should be completely harvested in order to obtain an optimal yield, and determine the optimal sustainable yield in each case. Make sure that you allow k to take on only integer values in your calculations.


(c) Compare the values of k determined in part (b) to $1 + \rho/\ln(2)$ and use some calculus to explain why
\[ k \simeq 1 + \frac{\rho}{\ln(2)} \]


10.10 Computer Graphics


In this section we assume that a view of a three-dimensional object is displayed on a video screen and show
how matrix algebra can be used to obtain new views of the object by rotation, translation, and scaling. 


Prerequisites 


Matrix Algebra 
Analytic Geometry 


Visualization of a Three-Dimensional Object 


Suppose that we want to visualize a three-dimensional object by displaying various views of it on a video 
screen. The object we have in mind to display is to be determined by a finite number of straight line segments. 
As an example, consider the truncated right pyramid with hexagonal base illustrated in Figure 10.10.1. We first 
introduce an xyz-coordinate system in which to embed the object. As in Figure 10.10.1, we orient the coordinate 
system so that its origin is at the center of the video screen and the xy-plane coincides with the plane of the 
screen. Consequently, an observer will see only the projection of the view of the three-dimensional object onto 
the two-dimensional xy-plane. 


Figure 10.10.1 


In the xyz-coordinate system, the endpoints $P_1, P_2, \ldots, P_n$ of the straight line segments that determine the view of the object will have certain coordinates, say,
\[ (x_1, y_1, z_1), \; (x_2, y_2, z_2), \; \ldots, \; (x_n, y_n, z_n) \]


These coordinates, together with a specification of which pairs are to be connected by straight line segments, are to be stored in the memory of the video display system. For example, assume that the 12 vertices of the truncated pyramid in Figure 10.10.1 have the following coordinates (the screen is 4 units wide by 3 units high):


\[
\begin{array}{ll}
P_1: (1.000, -.800, .000), & P_2: (.500, -.800, -.866), \\
P_3: (-.500, -.800, -.866), & P_4: (-1.000, -.800, .000), \\
P_5: (-.500, -.800, .866), & P_6: (.500, -.800, .866), \\
P_7: (.840, -.400, .000), & P_8: (.315, .125, -.546), \\
P_9: (-.210, .650, -.364), & P_{10}: (-.360, .800, .000), \\
P_{11}: (-.210, .650, .364), & P_{12}: (.315, .125, .546)
\end{array}
\]


These 12 vertices are connected pairwise by 18 straight line segments as follows, where $P_i \leftrightarrow P_j$ denotes that point $P_i$ is connected to point $P_j$:

\[
\begin{array}{llllll}
P_1 \leftrightarrow P_2, & P_2 \leftrightarrow P_3, & P_3 \leftrightarrow P_4, & P_4 \leftrightarrow P_5, & P_5 \leftrightarrow P_6, & P_6 \leftrightarrow P_1, \\
P_7 \leftrightarrow P_8, & P_8 \leftrightarrow P_9, & P_9 \leftrightarrow P_{10}, & P_{10} \leftrightarrow P_{11}, & P_{11} \leftrightarrow P_{12}, & P_{12} \leftrightarrow P_7, \\
P_1 \leftrightarrow P_7, & P_2 \leftrightarrow P_8, & P_3 \leftrightarrow P_9, & P_4 \leftrightarrow P_{10}, & P_5 \leftrightarrow P_{11}, & P_6 \leftrightarrow P_{12}
\end{array}
\]
In View 1 these 18 straight line segments are shown as they would appear on the video screen. It should be noticed that only the x- and y-coordinates of the vertices are needed by the video display system to draw the view, because only the projection of the object onto the xy-plane is displayed. However, we must keep track of the z-coordinates to carry out certain transformations discussed later.


We now show how to form new views of the object by scaling, translating, or rotating the initial view. We first 
construct a $3 \times n$ matrix P, referred to as the coordinate matrix of the view, whose columns are the coordinates of the n points of a view:


\[
P = \begin{bmatrix}
x_1 & x_2 & \cdots & x_n \\
y_1 & y_2 & \cdots & y_n \\
z_1 & z_2 & \cdots & z_n
\end{bmatrix}
\]


For example, the coordinate matrix P corresponding to View 1 is the $3 \times 12$ matrix


\[
\begin{bmatrix}
1.000 & .500 & -.500 & -1.000 & -.500 & .500 & .840 & .315 & -.210 & -.360 & -.210 & .315 \\
-.800 & -.800 & -.800 & -.800 & -.800 & -.800 & -.400 & .125 & .650 & .800 & .650 & .125 \\
.000 & -.866 & -.866 & .000 & .866 & .866 & .000 & -.546 & -.364 & .000 & .364 & .546
\end{bmatrix}
\]


We will show below how to transform the coordinate matrix P of a view to a new coordinate matrix P' 
corresponding to a new view of the object. The straight line segments connecting the various points move with 
the points as they are transformed. In this way, each view is uniquely determined by its coordinate matrix once 
we have specified which pairs of points in the original view are to be connected by straight lines. 
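

As a concrete illustration of the data that must be stored, the sketch below holds View 1 as a list of vertices and a list of index pairs, and forms the coordinate matrix P from them. It is written in plain Python; the variable names are illustrative assumptions.

```python
# View 1 of the truncated pyramid: 12 vertices and 18 connecting segments.
# Variable names are illustrative; the numbers are those listed in the text.

vertices = [
    ( 1.000, -0.800,  0.000),  # P1
    ( 0.500, -0.800, -0.866),  # P2
    (-0.500, -0.800, -0.866),  # P3
    (-1.000, -0.800,  0.000),  # P4
    (-0.500, -0.800,  0.866),  # P5
    ( 0.500, -0.800,  0.866),  # P6
    ( 0.840, -0.400,  0.000),  # P7
    ( 0.315,  0.125, -0.546),  # P8
    (-0.210,  0.650, -0.364),  # P9
    (-0.360,  0.800,  0.000),  # P10
    (-0.210,  0.650,  0.364),  # P11
    ( 0.315,  0.125,  0.546),  # P12
]

# Pairs (i, j) meaning "P_i is connected to P_j" (1-based, as in the text).
edges = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),          # base
    (7, 8), (8, 9), (9, 10), (10, 11), (11, 12), (12, 7),    # upper face
    (1, 7), (2, 8), (3, 9), (4, 10), (5, 11), (6, 12),       # side edges
]

# Coordinate matrix P: 3 rows (x, y, z), one column per vertex.
P = [list(row) for row in zip(*vertices)]

# What the screen shows: each segment projected onto the xy-plane.
segments_2d = [
    ((P[0][i - 1], P[1][i - 1]), (P[0][j - 1], P[1][j - 1]))
    for i, j in edges
]

print(len(P), "x", len(P[0]), "coordinate matrix,", len(segments_2d), "segments")
```

Keeping the connection list separate from P reflects the remark above: a transformation changes only the coordinate matrix, while the list of connected pairs stays fixed.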


Scaling 


The first type of transformation we consider consists of scaling a view along the x, y, and z directions by factors of $\alpha$, $\beta$, and $\gamma$, respectively. By this we mean that if a point $P_i$ has coordinates $(x_i, y_i, z_i)$ in the original view, it is to move to a new point $P_i'$ with coordinates $(\alpha x_i, \beta y_i, \gamma z_i)$ in the new view. This has the effect of transforming a unit cube in the original view to a rectangular parallelepiped of dimensions $\alpha \times \beta \times \gamma$ (Figure 10.10.2). Mathematically, this may be accomplished with matrix multiplication as follows. Define a $3 \times 3$ diagonal matrix
\[
S = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\]


Then, if a point $P_i$ in the original view is represented by the column vector
\[ \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} \]


then the transformed point $P_i'$ is represented by the column vector


\[
\begin{bmatrix} x_i' \\ y_i' \\ z_i' \end{bmatrix}
= \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
\]


Using the coordinate matrix P, which contains the coordinates of all n points of the original view as its columns, we can transform these n points simultaneously to produce the coordinate matrix P' of the scaled view, as follows:


\[
SP = \begin{bmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}
= \begin{bmatrix} \alpha x_1 & \alpha x_2 & \cdots & \alpha x_n \\ \beta y_1 & \beta y_2 & \cdots & \beta y_n \\ \gamma z_1 & \gamma z_2 & \cdots & \gamma z_n \end{bmatrix}
= P'
\]


The new coordinate matrix can then be entered into the video display system to produce the new view of the object. As an example, View 2 is View 1 scaled by setting $\alpha = 1.8$, $\beta = 0.5$, and $\gamma = 3.0$. Note that the scaling $\gamma = 3.0$ along the z-axis is not visible in View 2, since we see only the projection of the object onto the xy-plane.
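

The product P' = SP can be checked on a few columns of the View 1 matrix. The following Python sketch applies the View 2 factors to the points $P_1$, $P_2$, $P_3$; the helper `matmul` is written for this sketch and is an assumption, not a routine from the text.

```python
# Scaling a view: P' = S P with S = diag(alpha, beta, gamma).
# matmul is a small helper written for this sketch, not a library routine.

def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


alpha, beta, gamma = 1.8, 0.5, 3.0        # the factors used for View 2
S = [[alpha, 0.0, 0.0],
     [0.0, beta, 0.0],
     [0.0, 0.0, gamma]]

# First three columns (P1, P2, P3) of the View 1 coordinate matrix.
P = [[ 1.000,  0.500, -0.500],
     [-0.800, -0.800, -0.800],
     [ 0.000, -0.866, -0.866]]

P_scaled = matmul(S, P)
for row in P_scaled:
    print([round(v, 3) for v in row])
```

Only the first two rows of the result affect what is drawn on the screen; the third row holds the stretched z-coordinates.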


Figure 10.10.2


View 2 View 1 scaled by $\alpha = 1.8$, $\beta = 0.5$, $\gamma = 3.0$.


Translation 
We next consider the transformation of translating or displacing an object to a new position on the screen. 


Referring to Figure 10.10.3, suppose we desire to change an existing view so that each point $P_i$ with coordinates $(x_i, y_i, z_i)$ moves to a new point $P_i'$ with coordinates $(x_i + x_0, y_i + y_0, z_i + z_0)$. The vector
\[ \begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix} \]


is called the translation vector of the transformation. By defining a $3 \times n$ matrix T as


\[
T = \begin{bmatrix}
x_0 & x_0 & \cdots & x_0 \\
y_0 & y_0 & \cdots & y_0 \\
z_0 & z_0 & \cdots & z_0
\end{bmatrix}
\]


we can translate all n points of the view determined by the coordinate matrix P by matrix addition via the equation
\[ P' = P + T \]

The coordinate matrix P' then specifies the new coordinates of the n points. For example, if we wish to translate View 1 according to the translation vector

\[ \begin{bmatrix} 1.2 \\ 0.4 \\ 1.7 \end{bmatrix} \]


the result is View 3. Note, again, that the translation $z_0 = 1.7$ along the z-axis does not show up explicitly in View 3.
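

The translation P' = P + T can be sketched in the same way. The Python fragment below builds T from the View 3 translation vector and adds it to three columns of the View 1 matrix; the helper names are assumptions made for this illustration.

```python
# Translating a view: P' = P + T, where every column of T is (x0, y0, z0).
# Helper names are illustrative assumptions.

def translation_matrix(t, n):
    """Return the 3 x n matrix T whose n columns all equal the vector t."""
    return [[component] * n for component in t]


def matadd(A, B):
    """Add two matrices of the same shape, stored as lists of rows."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


P = [[ 1.000,  0.500, -0.500],   # first three columns of the View 1 matrix
     [-0.800, -0.800, -0.800],
     [ 0.000, -0.866, -0.866]]

t = (1.2, 0.4, 1.7)              # translation vector used for View 3
T = translation_matrix(t, n=len(P[0]))
P_translated = matadd(P, T)

for row in P_translated:
    print([round(v, 3) for v in row])
```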


View 3 View 1 translated by $x_0 = 1.2$, $y_0 = 0.4$, $z_0 = 1.7$.


Figure 10.10.3 


In Exercise 7, a technique of performing translations by matrix multiplication rather than by matrix addition is 
explained. 


Rotation 


A more complicated type of transformation is a rotation of a view about one of the three coordinate axes. We 
begin with a rotation about the z-axis (the axis perpendicular to the screen) through an angle $\theta$. Given a point $P_i$ in the original view with coordinates $(x_i, y_i, z_i)$, we wish to compute the new coordinates $(x_i', y_i', z_i')$ of the rotated point $P_i'$. Referring to Figure 10.10.4 and using a little trigonometry, you should be able to derive the following:

\[
\begin{aligned}
x_i' &= \rho \cos(\phi + \theta) = \rho \cos\phi \cos\theta - \rho \sin\phi \sin\theta = x_i \cos\theta - y_i \sin\theta \\
y_i' &= \rho \sin(\phi + \theta) = \rho \cos\phi \sin\theta + \rho \sin\phi \cos\theta = x_i \sin\theta + y_i \cos\theta \\
z_i' &= z_i
\end{aligned}
\]
These equations can be written in matrix form as

\[
\begin{bmatrix} x_i' \\ y_i' \\ z_i' \end{bmatrix}
= \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}
\]


If we let R denote the $3 \times 3$ matrix in this equation, all n points can be rotated by the matrix product P' = RP to yield the coordinate matrix P' of the rotated view.
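

A short numerical check of the rotation formulas may be helpful. The Python sketch below builds R for a rotation about the z-axis and applies it to the points $P_1$ and $P_2$ of View 1 with $\theta = 90°$; the function names are assumptions made for this illustration.

```python
import math

# Rotation of a view about the z-axis: P' = R P.
# Function names are illustrative assumptions.

def rotation_z(theta_degrees):
    """Matrix R for a rotation through theta about the z-axis."""
    theta = math.radians(theta_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


P = [[ 1.000,  0.500],     # columns are P1 and P2 of View 1
     [-0.800, -0.800],
     [ 0.000, -0.866]]

R = rotation_z(90)
P_rotated = matmul(R, P)
for row in P_rotated:
    print([round(v, 3) for v in row])
```

The z-coordinates in the last row are unchanged, as the equation $z_i' = z_i$ requires, and each point keeps its distance from the z-axis.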




Figure 10.10.4 


Rotations about the x- and y-axes can be accomplished analogously, and the resulting rotation matrices are 
given with Views 4, 5, and 6. These three new views of the truncated pyramid correspond to rotations of View 1 
about the x-, y-, and z-axes, respectively, each through an angle of 90°. 


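Rotation about the x-axis


\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}
\]


View 4 View 1 rotated 90° about the x-axis.


Rotation about the y-axis


\[
\begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}
\]


View 5 View 1 rotated 90° about the y-axis.


Rotation about the z-axis


\[
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]


View 6 View 1 rotated 90° about the z-axis.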


Rotations about the three coordinate axes may be combined to give oblique views of an object. For example, View 7 is View 1 rotated first about the x-axis through 30°, then about the y-axis through −70°, and finally about the z-axis through −27°. Mathematically, these three successive rotations can be embodied in the single transformation equation P' = RP, where R is the product of three individual rotation matrices:
\[
R_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(30°) & -\sin(30°) \\ 0 & \sin(30°) & \cos(30°) \end{bmatrix}
\]
\[
R_2 = \begin{bmatrix} \cos(-70°) & 0 & \sin(-70°) \\ 0 & 1 & 0 \\ -\sin(-70°) & 0 & \cos(-70°) \end{bmatrix}
\]
\[
R_3 = \begin{bmatrix} \cos(-27°) & -\sin(-27°) & 0 \\ \sin(-27°) & \cos(-27°) & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

in the order

\[
R = R_3 R_2 R_1 = \begin{bmatrix} .305 & -.025 & -.952 \\ -.155 & .985 & -.076 \\ .940 & .171 & .296 \end{bmatrix}
\]
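

The order of the factors matters, since the rotation applied first must stand rightmost in the product. The following Python sketch forms $R = R_3 R_2 R_1$ for the angles of View 7 so the entries can be compared with the matrix above; the function names are assumptions made for this illustration.

```python
import math

# Composite rotation for View 7: x-axis 30 deg, then y-axis -70 deg,
# then z-axis -27 deg.  Function names are illustrative assumptions.

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


R1, R2, R3 = rot_x(30), rot_y(-70), rot_z(-27)
R = matmul(R3, matmul(R2, R1))       # first rotation is the rightmost factor

for row in R:
    print([round(v, 3) for v in row])
```

Multiplying in the opposite order, $R_1 R_2 R_3$, gives a different matrix, because rotations about different axes do not commute in general.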


View 7 Oblique view of truncated pyramid. 


As a final illustration, in View 8 we have two separate views of the truncated pyramid, which constitute a stereoscopic pair. They were produced by first rotating View 7 about the y-axis through an angle of −3° and translating it to the right, then rotating the same View 7 about the y-axis through an angle of +3° and translating it to the left. The translation distances were chosen so that the stereoscopic views are about $2\tfrac{1}{2}$ inches apart, the approximate distance between a pair of eyes.


View 8 Stereoscopic figure of truncated pyramid. The three-dimensionality of the diagram can be seen 
by holding the book about one foot away and focusing on a distant object. Then by shifting your 
gaze to View 8 without refocusing, you can make the two views of the stereoscopic pair merge 
together and produce the desired effect. 


Exercise Set 10.10 


1. View 9 is a view of a square with vertices (0, 0, 0), (1, 0, 0), (1, 1, 0), and (0, 1, 0).
(a) What is the coordinate matrix of View 9?


(b) What is the coordinate matrix of View 9 after it is scaled by a factor of $1\tfrac{1}{2}$ in the x-direction and $\tfrac{1}{2}$ in the y-direction? Draw a sketch of the scaled view.
(c) What is the coordinate matrix of View 9 after it is translated by the following vector?
\[ \begin{bmatrix} -2 \\ -1 \\ 3 \end{bmatrix} \]


Draw a sketch of the translated view.


(d) What is the coordinate matrix of View 9 after it is rotated through an angle of −30° about the z-axis? Draw a sketch of the rotated view.


Ex-View 9 Square with vertices (0, 0, 0), (1, 0, 0), (1, 1, 0), and (0, 1, 0) (Exercises 1 and 2)


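Answer:
(a)
\[ \begin{bmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
(b)
\[ \begin{bmatrix} 0 & \frac{3}{2} & \frac{3}{2} & 0 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
(c)
\[ \begin{bmatrix} -2 & -1 & -1 & -2 \\ -1 & -1 & 0 & 0 \\ 3 & 3 & 3 & 3 \end{bmatrix} \]
(d)
\[ \begin{bmatrix} 0 & .866 & 1.366 & .500 \\ 0 & -.500 & .366 & .866 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]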

2. (a) If the coordinate matrix of View 9 is multiplied by the matrix


\[ \begin{bmatrix} 1 & \frac{1}{2} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]


the result is the coordinate matrix of View 10. Such a transformation is called a shear in the x-direction with factor $\tfrac{1}{2}$ with respect to the y-coordinate. Show that under such a transformation, a point with coordinates $(x_i, y_i, z_i)$ has new coordinates $(x_i + \tfrac{1}{2} y_i, y_i, z_i)$.


(b) What are the coordinates of the four vertices of the sheared square in View 10?


(c) The matrix


\[ \begin{bmatrix} 1 & 0 & 0 \\ .6 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]


determines a shear in the y-direction with factor .6 with respect to the x-coordinate (an example appears in View 11). Sketch a view of the square in View 9 after such a shearing transformation, and find the new coordinates of its four vertices.


Ex-View 10 View 9 sheared along the x-axis by $\tfrac{1}{2}$ with respect to the y-coordinate (Exercise 2).


Ex-View 11 View 1 sheared along the y-axis by .6 with respect to the x-coordinate (Exercise 2).


Answer:


(b) (0, 0, 0), (1, 0, 0), $(1\tfrac{1}{2}, 1, 0)$, and $(\tfrac{1}{2}, 1, 0)$


(c) (0, 0, 0), (1, .6, 0), (1, 1.6, 0), (0, 1, 0)


3. (a) The reflection about the xz-plane is defined as the transformation that takes a point $(x_i, y_i, z_i)$ to the point $(x_i, -y_i, z_i)$ (e.g., View 12). If P and P' are the coordinate matrices of a view and its reflection about the xz-plane, respectively, find a matrix M such that P' = MP.


(b) Analogous to part (a), define the reflection about the yz-plane and construct the corresponding 
transformation matrix. Draw a sketch of View 1 reflected about the yz-plane. 


(c) Analogous to part (a), define the reflection about the xy-plane and construct the corresponding 
transformation matrix. Draw a sketch of View 1 reflected about the xy-plane. 


Ex-View 12 View 1 reflected about the xz-plane (Exercise 3). 


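Answer:

(a)
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(b)
\[ \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

(c)
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]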

4. (a) View 13 is View 1 subject to the following five transformations:


1. Scale by a factor of $\tfrac{1}{2}$ in the x-direction, 2 in the y-direction, and $\tfrac{1}{3}$ in the z-direction.
2. Translate $\tfrac{1}{2}$ unit in the x-direction.
3. Rotate 20° about the x-axis.
4. Rotate −45° about the y-axis.
5. Rotate 90° about the z-axis.
Construct the five matrices $M_1$, $M_2$, $M_3$, $M_4$, and $M_5$ associated with these five transformations.


(b) If P is the coordinate matrix of View 1 and P' is the coordinate matrix of View 13, express P' in terms of $M_1$, $M_2$, $M_3$, $M_4$, $M_5$, and P.


Ex-View 13 View 1 scaled, translated, and rotated (Exercise 4)


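Answer:
(a)
\[
M_1 = \begin{bmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & \frac{1}{3} \end{bmatrix}, \quad
M_2 = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \cdots & \frac{1}{2} \\ 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \quad
M_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos 20° & -\sin 20° \\ 0 & \sin 20° & \cos 20° \end{bmatrix},
\]
\[
M_4 = \begin{bmatrix} \cos(-45°) & 0 & \sin(-45°) \\ 0 & 1 & 0 \\ -\sin(-45°) & 0 & \cos(-45°) \end{bmatrix}, \quad
M_5 = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]


(b) $P' = M_5 M_4 M_3 (M_1 P + M_2)$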


5. (a) View 14 is View 1 subject to the following seven transformations:


1. Scale by a factor of .3 in the x-direction and by a factor of .5 in the y-direction.
2. Rotate 45° about the x-axis.
3. Translate 1 unit in the x-direction.
4. Rotate 35° about the y-axis.
5. Rotate −45° about the z-axis.
6. Translate 1 unit in the z-direction.
7. Scale by a factor of 2 in the x-direction.
Construct the matrices $M_1, M_2, \ldots, M_7$ associated with these seven transformations.


(b) If P is the coordinate matrix of View 1 and P' is the coordinate matrix of View 14, express P' in terms of $M_1, M_2, \ldots, M_7$, and P.


Ex-View 14 View 1 scaled, translated, and rotated (Exercise 5).


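Answer:
(a)
\[
M_1 = \begin{bmatrix} .3 & 0 & 0 \\ 0 & .5 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
M_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos 45° & -\sin 45° \\ 0 & \sin 45° & \cos 45° \end{bmatrix}, \quad
M_3 = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \end{bmatrix},
\]
\[
M_4 = \begin{bmatrix} \cos 35° & 0 & \sin 35° \\ 0 & 1 & 0 \\ -\sin 35° & 0 & \cos 35° \end{bmatrix}, \quad
M_5 = \begin{bmatrix} \cos(-45°) & -\sin(-45°) & 0 \\ \sin(-45°) & \cos(-45°) & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
\[
M_6 = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ 1 & 1 & \cdots & 1 \end{bmatrix}, \quad
M_7 = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]


(b) $P' = M_7 (M_5 M_4 (M_2 M_1 P + M_3) + M_6)$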


6. Suppose that a view with coordinate matrix P is to be rotated through an angle $\theta$ about an axis through the origin and specified by two angles $\alpha$ and $\beta$ (see Figure Ex-6). If P' is the coordinate matrix of the rotated view, find rotation matrices $R_1$, $R_2$, $R_3$, $R_4$, and $R_5$ such that


\[ P' = R_5 R_4 R_3 R_2 R_1 P \]
[Hint: The desired rotation can be accomplished in the following five steps:
1. Rotate through an angle of $\beta$ about the y-axis.
2. Rotate through an angle of $\alpha$ about the z-axis.
3. Rotate through an angle of $\theta$ about the y-axis.
4. Rotate through an angle of $-\alpha$ about the z-axis.
5. Rotate through an angle of $-\beta$ about the y-axis.]


Figure Ex-6 


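Answer:


\[
R_1 = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}, \quad
R_2 = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
\[
R_3 = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad
R_4 = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix},
\]
\[
R_5 = \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix}
\]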
7. This exercise illustrates a technique for translating a point with coordinates $(x_i, y_i, z_i)$ to a point with coordinates $(x_i + x_0, y_i + y_0, z_i + z_0)$ by matrix multiplication rather than matrix addition.


(a) Let the point $(x_i, y_i, z_i)$ be associated with the column vector
\[ \mathbf{v}_i = \begin{bmatrix} x_i \\ y_i \\ z_i \\ 1 \end{bmatrix} \]


and let the point $(x_i + x_0, y_i + y_0, z_i + z_0)$ be associated with the column vector
\[ \mathbf{v}_i' = \begin{bmatrix} x_i + x_0 \\ y_i + y_0 \\ z_i + z_0 \\ 1 \end{bmatrix} \]


Find a $4 \times 4$ matrix M such that $\mathbf{v}_i' = M\mathbf{v}_i$.


(b) Find the specific $4 \times 4$ matrix of the above form that will effect the translation of the point (4, −2, 3) to the point (−1, 7, 0).


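Answer:
(a)
\[ M = \begin{bmatrix} 1 & 0 & 0 & x_0 \\ 0 & 1 & 0 & y_0 \\ 0 & 0 & 1 & z_0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
(b)
\[ \begin{bmatrix} 1 & 0 & 0 & -5 \\ 0 & 1 & 0 & 9 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]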


8. For the three rotation matrices given with Views 4, 5, and 6, show that
\[ R^{-1} = R^T \]


(A matrix with this property is called an orthogonal matrix. See Section 7.1.) 


Section 10.10 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a 
basic proficiency with your technology utility. Once you have mastered the techniques in these exercises, you 
will be able to use your technology utility to solve many of the problems in the regular exercise sets. 


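T1. Let (a, b, c) be a unit vector normal to the plane $ax + by + cz = 0$, and let $\mathbf{r} = (x, y, z)$ be a vector. It can be shown that the mirror image of the vector $\mathbf{r}$ through the above plane has coordinates $\mathbf{r}_m = (x_m, y_m, z_m)$, where


\[ \begin{bmatrix} x_m \\ y_m \\ z_m \end{bmatrix} = M \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
with
\[
M = I - 2\mathbf{n}\mathbf{n}^T
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
- 2 \begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix}
\]


(a) Show that $M^2 = I$ and give a physical reason why this must be so. [Hint: Use the fact that (a, b, c) is a unit vector to show that $\mathbf{n}^T\mathbf{n} = 1$.]


(b) Use a computer to show that $\det(M) = -1$.

(c) The eigenvectors of M satisfy the equation
\[
\begin{bmatrix} x_m \\ y_m \\ z_m \end{bmatrix}
= M \begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \lambda \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]


and therefore correspond to those vectors whose direction is not affected by a reflection through the plane. Use a computer to determine the eigenvectors and eigenvalues of M, and then give a physical argument to support your answer.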


T2. A vector $\mathbf{v} = (x, y, z)$ is rotated by an angle $\theta$ about an axis having unit vector (a, b, c), thereby forming the rotated vector $\mathbf{v}_R = (x_R, y_R, z_R)$. It can be shown that


\[ \begin{bmatrix} x_R \\ y_R \\ z_R \end{bmatrix} = R(\theta) \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]

with
\[
R(\theta) = \cos(\theta) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
+ (1 - \cos(\theta)) \begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} a & b & c \end{bmatrix}
+ \sin(\theta) \begin{bmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{bmatrix}
\]


(a) Use a computer to show that $R(\theta)R(\phi) = R(\theta + \phi)$, and then give a physical reason why this must be so. Depending on the sophistication of the computer you are using, you may have to experiment using different values of a, b, and
\[ c = \sqrt{1 - a^2 - b^2} \]


(b) Show also that $R^{-1}(\theta) = R(-\theta)$ and give a physical reason why this must be so.
(c) Use a computer to show that $\det(R(\theta)) = +1$.


10.11 Equilibrium Temperature Distributions


In this section we will see that the equilibrium temperature distribution within a trapezoidal plate can be found when the temperatures around the edges of the plate are specified. The problem is reduced to solving a system of linear equations. Also, an iterative technique for solving the problem and a “random walk” approach to the problem are described.


Prerequisites 


Linear Systems 
Matrices 


Intuitive Understanding of Limits 


Boundary Data 


Suppose that the two faces of the thin trapezoidal plate shown in Figure 10.11.1a are insulated from heat. Suppose 
that we are also given the temperature along the four edges of the plate. For example, let the temperature be 
constant on each edge with values of 0°, 0°, 1°, and 2°, as in the figure. After a period of time, the temperature 
inside the plate will stabilize. Our objective in this section is to determine this equilibrium temperature distribution 
at the points inside the plate. As we will see, the interior equilibrium temperature is completely determined by the 
boundary data—that is, the temperature along the edges of the plate. 


Figure 10.11.1


The equilibrium temperature distribution can be visualized by the use of curves that connect points of equal temperature. Such curves are called isotherms of the temperature distribution. In Figure 10.11.1b we have sketched a few isotherms, using information we derive later in the chapter.


Although all our calculations will be for the trapezoidal plate illustrated, our techniques generalize easily to a plate 
of any practical shape. They also generalize to the problem of finding the temperature within a three-dimensional 
body. In fact, our “plate” could be the cross section of some solid object if the flow of heat perpendicular to the 
cross section is negligible. For example, Figure 10.11.1 could represent the cross section of a long dam. The dam is 
exposed to three different temperatures: the temperature of the ground at its base, the temperature of the water on 
one side, and the temperature of the air on the other side. A knowledge of the temperature distribution inside the 
dam is necessary to determine the thermal stresses to which it is subjected. 


Next we will consider a certain thermodynamic principle that characterizes the temperature distribution we are seeking.


The Mean-Value Property 


There are many different ways to obtain a mathematical model for our problem. The approach we use is based on 
the following property of equilibrium temperature distributions. 


THEOREM 10.11.1 The Mean-Value Property 


Let a plate be in thermal equilibrium and let P be a point inside the plate. Then if C is any circle with 
center at P that is completely contained in the plate, the temperature at P is the average value of the 
temperature on the circle (Figure 10.11.2). 




Figure 10.11.2 


This property is a consequence of certain basic laws of molecular motion, and we will not attempt to derive it. 

Basically, this property states that in equilibrium, thermal energy tends to distribute itself as evenly as possible 
consistent with the boundary conditions. It can be shown that the mean-value property uniquely determines the 
equilibrium temperature distribution of a plate. 


Unfortunately, determining the equilibrium temperature distribution from the mean-value property is not an easy 
matter. However, if we restrict ourselves to finding the temperature only at a finite set of points within the plate, 
the problem can be reduced to solving a linear system. We pursue this idea next. 


Discrete Formulation of the Problem 


We can overlay our trapezoidal plate with a succession of finer and finer square nets or meshes (Figure 10.11.3). In 
(a) we have a rather coarse net; in (b) we have a net with half the spacing as in (a); and in (c) we have a net with 
the spacing again reduced by half. The points of intersection of the net lines are called mesh points. We classify 
them as boundary mesh points if they fall on the boundary of the plate or as interior mesh points if they lie in the 
interior of the plate. For the three net spacings we have chosen, there are 1, 9, and 49 interior mesh points, 
respectively. 


(a) 1 interior mesh point (b) 9 interior mesh points (c) 49 interior mesh points
Figure 10.11.3


In the discrete formulation of our problem, we try to find the temperature only at the interior mesh points of some
particular net. For a rather fine net, as in (c), this will provide an excellent picture of the temperature distribution 
throughout the entire plate. 


At the boundary mesh points, the temperature is given by the boundary data. (In Figure 10.11.3 we have labeled all 
the boundary mesh points with their corresponding temperatures.) At the interior mesh points, we will apply the 
following discrete version of the mean-value property. 


THEOREM 10.11.2 Discrete Mean-Value Property 


At each interior mesh point, the temperature is approximately the average of the temperatures at the four 
neighboring mesh points. 


This discrete version is a reasonable approximation to the true mean-value property. But because it is only an 
approximation, it will provide only an approximation to the true temperatures at the interior mesh points. However, 
the approximations will get better as the mesh spacing decreases. In fact, as the mesh spacing approaches zero, the 
approximations approach the exact temperature distribution, a fact proved in advanced courses in numerical 
analysis. We will illustrate this convergence by computing the approximate temperatures at the mesh points for the 
three mesh spacings given in Figure 10.11.3. 


Case (a) of Figure 10.11.3 is simple, for there is only one interior mesh point. If we let $t_0$ be the temperature at this mesh point, the discrete mean-value property immediately gives
\[ t_0 = \tfrac{1}{4}(2 + 1 + 0 + 0) = 0.75 \]


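In case (b) we can label the temperatures at the nine interior mesh points $t_1, t_2, \ldots, t_9$, as in Figure 10.11.3b. (The particular ordering is not important.) By applying the discrete mean-value property successively to each of these nine mesh points, we obtain the following nine equations:


\[
\begin{aligned}
t_1 &= \tfrac{1}{4}(t_2 + 2 + 0 + 0) \\
t_2 &= \tfrac{1}{4}(t_1 + t_3 + t_4 + 2) \\
t_3 &= \tfrac{1}{4}(t_2 + t_5 + 0 + 0) \\
t_4 &= \tfrac{1}{4}(t_2 + t_5 + t_7 + 2) \\
t_5 &= \tfrac{1}{4}(t_3 + t_4 + t_6 + t_8) \\
t_6 &= \tfrac{1}{4}(t_5 + t_9 + 0 + 0) \\
t_7 &= \tfrac{1}{4}(t_4 + t_8 + 1 + 2) \\
t_8 &= \tfrac{1}{4}(t_5 + t_7 + t_9 + 1) \\
t_9 &= \tfrac{1}{4}(t_6 + t_8 + 1 + 0)
\end{aligned}
\tag{1}
\]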


This is a system of nine linear equations in nine unknowns. We can rewrite it in matrix form as 


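\[ \mathbf{t} = M\mathbf{t} + \mathbf{b} \tag{2} \]
where
\[
\mathbf{t} = \begin{bmatrix} t_1 \\ t_2 \\ t_3 \\ t_4 \\ t_5 \\ t_6 \\ t_7 \\ t_8 \\ t_9 \end{bmatrix}, \quad
M = \frac{1}{4} \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ 0 \\ \frac{1}{2} \\ 0 \\ 0 \\ \frac{3}{4} \\ \frac{1}{4} \\ \frac{1}{4} \end{bmatrix}
\]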

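To solve Equation 2, we write it as


\[ (I - M)\mathbf{t} = \mathbf{b} \]


The solution for $\mathbf{t}$ is thus
\[ \mathbf{t} = (I - M)^{-1}\mathbf{b} \tag{3} \]


as long as the matrix $(I - M)$ is invertible. This is indeed the case, and the solution for $\mathbf{t}$ as calculated by 3 is


\[
\mathbf{t} = \begin{bmatrix} 0.7846 \\ 1.1383 \\ 0.4719 \\ 1.2967 \\ 0.7491 \\ 0.3265 \\ 1.2995 \\ 0.9014 \\ 0.5570 \end{bmatrix}
\tag{4}
\]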

Figure 10.11.4 is a diagram of the plate with the nine interior mesh points labeled with their temperatures as given by this solution.


Figure 10.11.4
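

For readers who wish to check the solution 4 numerically, the following Python sketch assembles M and b from Equations 1 and solves $(I - M)\mathbf{t} = \mathbf{b}$. It uses plain Gaussian elimination rather than forming the inverse explicitly; the helper `solve_linear` and the dictionary layout are assumptions made for this illustration.

```python
# Sketch: solve t = Mt + b for the nine-point mesh of case (b).
# The neighbor lists are read off Equations 1; solve_linear is a helper
# written for this sketch (Gaussian elimination with partial pivoting).

interior = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 5, 7], 5: [3, 4, 6, 8],
            6: [5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}
boundary = {1: [2, 0, 0], 2: [2], 3: [0, 0], 4: [2], 5: [],
            6: [0, 0], 7: [1, 2], 8: [1], 9: [1, 0]}

n = 9
M = [[0.0] * n for _ in range(n)]
b = [0.0] * n
for i in range(1, n + 1):
    for j in interior[i]:
        M[i - 1][j - 1] = 0.25               # interior neighbor t_j
    b[i - 1] = 0.25 * sum(boundary[i])       # known boundary temperatures


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    size = len(A)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x


I_minus_M = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(n)]
             for i in range(n)]
t = solve_linear(I_minus_M, b)
print([round(v, 4) for v in t])
```

Solving the system directly gives the same vector as multiplying b by the inverse of $I - M$, and it avoids computing the inverse.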


For case (c) of Figure 10.11.3, we repeat this same procedure. We label the temperatures at the 49 interior mesh points as $t_1, t_2, \ldots, t_{49}$ in some manner. For example, we may begin at the top of the plate and proceed from left to right along each row of mesh points. Applying the discrete mean-value property to each mesh point gives a system of 49 linear equations in 49 unknowns:


\[
\begin{aligned}
t_1 &= \tfrac{1}{4}(t_2 + 2 + 0 + 0) \\
t_2 &= \tfrac{1}{4}(t_1 + t_3 + t_4 + 2) \\
&\;\;\vdots \\
t_{48} &= \tfrac{1}{4}(t_{41} + t_{47} + t_{49} + 1) \\
t_{49} &= \tfrac{1}{4}(t_{42} + t_{48} + 0 + 1)
\end{aligned}
\tag{5}
\]
In matrix form, Equations 5 are
\[ \mathbf{t} = M\mathbf{t} + \mathbf{b} \]
where $\mathbf{t}$ and $\mathbf{b}$ are column vectors with 49 entries, and M is a $49 \times 49$ matrix. As in 3, the solution for $\mathbf{t}$ is
\[ \mathbf{t} = (I - M)^{-1}\mathbf{b} \tag{6} \]

In Figure 10.11.5 we display the temperatures at the 49 mesh points found by Equation 6. The nine unshaded
temperatures in this figure fall on the mesh points of Figure 10.11.4.

1.1611   0.4915
1.3625   0.8048   0.3528
1.4844   1.0122   0.6064   0.2710
1.5627   1.1533   0.7896   0.4778   0.2162
1.6131   1.2488   0.9210   0.6342   0.3868   0.1756
1.6409   1.3078   1.0114   0.7513   0.5214   0.3157   0.1344
1.6426   1.3301   1.0657   0.8380   0.6318   0.4312   0.2221
1.5994   1.3042   1.0834   0.9032   0.7365   0.5554   0.3227
1.4508   1.2039   1.0605   0.9548   0.8556   0.7311   0.5135

Figure 10.11.5


In Table 1 we compare the temperatures at these nine common mesh points for the three different mesh spacings 
used. 


Table 1  Temperatures at Common Mesh Points

         Case (b)   Case (c)
t_1      0.7846     0.8048
t_2      1.1383     1.1533
t_3      0.4719     0.4778
t_4      1.2967     1.3078
t_5      0.7491     0.7513
t_6      0.3265     0.3157
t_7      1.2995     1.3042
t_8      0.9014     0.9032
t_9      0.5570     0.5554


Knowing that the temperatures of the discrete problem approach the exact temperatures as the mesh spacing 
decreases, we may surmise that the nine temperatures obtained in case (c) are closer to the exact values than those 
in case (b). 


A Numerical Technique 


To obtain the 49 temperatures in case (c) of Figure 10.11.3, it was necessary to solve a linear system with 49 
unknowns. A finer net might involve a linear system with hundreds or even thousands of unknowns. Exact 
algorithms for the solutions of such large systems are impractical, and for this reason we now discuss a numerical 
technique for the practical solution of these systems. 

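To describe this technique, we look again at Equation 2:

$$\mathbf{t} = M\mathbf{t} + \mathbf{b} \tag{7}$$

The vector t we are seeking appears on both sides of this equation. We consider a way of generating better and better approximations to the vector solution t. For the initial approximation $\mathbf{t}^{(0)}$ we can take $\mathbf{t}^{(0)} = \mathbf{0}$ if no better choice is available. If we substitute $\mathbf{t}^{(0)}$ into the right side of Equation 7 and label the resulting left side as $\mathbf{t}^{(1)}$, we have

$$\mathbf{t}^{(1)} = M\mathbf{t}^{(0)} + \mathbf{b} \tag{8}$$

If we substitute $\mathbf{t}^{(1)}$ into the right side of Equation 7, we generate another approximation, which we label $\mathbf{t}^{(2)}$:

$$\mathbf{t}^{(2)} = M\mathbf{t}^{(1)} + \mathbf{b} \tag{9}$$

Continuing in this way, we generate a sequence of approximations as follows:

$$
\begin{aligned}
\mathbf{t}^{(1)} &= M\mathbf{t}^{(0)} + \mathbf{b}\\
\mathbf{t}^{(2)} &= M\mathbf{t}^{(1)} + \mathbf{b}\\
\mathbf{t}^{(3)} &= M\mathbf{t}^{(2)} + \mathbf{b}\\
&\ \ \vdots\\
\mathbf{t}^{(n)} &= M\mathbf{t}^{(n-1)} + \mathbf{b}\\
&\ \ \vdots
\end{aligned}
\tag{10}
$$

One would hope that this sequence of approximations $\mathbf{t}^{(0)}, \mathbf{t}^{(1)}, \mathbf{t}^{(2)}, \ldots$ converges to the exact solution of Equation 7. We do not have the space here to go into the theoretical considerations necessary to show this. Suffice it to say that for the particular problem we are considering, the sequence converges to the exact solution for any mesh size and for any initial approximation $\mathbf{t}^{(0)}$.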


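This technique of generating successive approximations to the solution of Equation 7 is a variation of a technique called Jacobi iteration; the approximations themselves are called iterates. As a numerical example, let us apply Jacobi iteration to the calculation of the nine mesh point temperatures of case (b). Setting $\mathbf{t}^{(0)} = \mathbf{0}$, we have, from Equation 2,

$$
\mathbf{t}^{(1)} = M\mathbf{t}^{(0)} + \mathbf{b} = M\mathbf{0} + \mathbf{b} = \mathbf{b} =
\begin{bmatrix} \tfrac12\\ \tfrac12\\ 0\\ \tfrac12\\ 0\\ 0\\ \tfrac34\\ \tfrac14\\ \tfrac14 \end{bmatrix}
= \begin{bmatrix} .5000\\ .5000\\ .0000\\ .5000\\ .0000\\ .0000\\ .7500\\ .2500\\ .2500 \end{bmatrix}
$$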

Some additional iterates are

$$
\mathbf{t}^{(2)} = M\mathbf{t}^{(1)} + \mathbf{b} =
\begin{bmatrix} .6250\\ .7500\\ .1250\\ .8125\\ .1875\\ .0625\\ .9375\\ .5000\\ .3125 \end{bmatrix}
$$

$$
\mathbf{t}^{(3)} = \begin{bmatrix} 0.6875\\ 0.8906\\ 0.2344\\ 0.9688\\ 0.3750\\ 0.1250\\ 1.0781\\ 0.6094\\ 0.3906 \end{bmatrix},\quad
\mathbf{t}^{(10)} = \begin{bmatrix} 0.7791\\ 1.1230\\ 0.4573\\ 1.2770\\ 0.7236\\ 0.3131\\ 1.2848\\ 0.8827\\ 0.5446 \end{bmatrix},\quad
\mathbf{t}^{(20)} = \begin{bmatrix} 0.7845\\ 1.1380\\ 0.4716\\ 1.2963\\ 0.7486\\ 0.3263\\ 1.2992\\ 0.9010\\ 0.5567 \end{bmatrix},\quad
\mathbf{t}^{(30)} = \begin{bmatrix} 0.7846\\ 1.1383\\ 0.4719\\ 1.2967\\ 0.7491\\ 0.3265\\ 1.2995\\ 0.9014\\ 0.5570 \end{bmatrix}
$$


All iterates beginning with the thirtieth are equal to $\mathbf{t}^{(30)}$ to four decimal places. Consequently, $\mathbf{t}^{(30)}$ is the exact solution to four decimal places. This agrees with our previous result given in Equation 4.

The Jacobi iteration scheme applied to the linear system 5 with 49 unknowns produces iterates that begin repeating to four decimal places after 119 iterations. Thus, $\mathbf{t}^{(119)}$ would provide the 49 temperatures of case (c) correct to
four decimal places.
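The iteration in Equations 10 is short enough to sketch in code. The following is an illustrative sketch for the nine-point mesh of case (b), assuming only the Python standard library; the names `INTERIOR`, `B`, `jacobi_step`, and `jacobi`, and the stopping tolerance, are choices made for this example rather than anything prescribed by the text.

```python
# Interior neighbours of each mesh point and the vector b, from Equation 2.
INTERIOR = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 5, 7], 5: [3, 4, 6, 8],
            6: [5, 9], 7: [4, 8], 8: [5, 7, 9], 9: [6, 8]}
B = [0.5, 0.5, 0.0, 0.5, 0.0, 0.0, 0.75, 0.25, 0.25]

def jacobi_step(t):
    """Return M t + b, i.e. one step of the iteration in Equations 10."""
    return [sum(t[j - 1] for j in INTERIOR[i]) / 4 + B[i - 1]
            for i in range(1, 10)]

def jacobi(t0, tol=5e-5, max_iter=1000):
    """Iterate until successive iterates differ by less than tol in every entry."""
    t = list(t0)
    for k in range(1, max_iter + 1):
        t_next = jacobi_step(t)
        if max(abs(a - c) for a, c in zip(t_next, t)) < tol:
            return t_next, k
        t = t_next
    return t, max_iter

t1 = jacobi_step([0.0] * 9)      # with t(0) = 0 this is just b
t2 = jacobi_step(t1)
t_final, iterations = jacobi([0.0] * 9)
print([round(v, 4) for v in t2])
print(iterations, [round(v, 4) for v in t_final])
```

Because each step needs only the current iterate and the sparse neighbour lists, the matrix $M$ never has to be stored in full, which is what makes the method attractive for fine meshes.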


A Monte Carlo Technique 


In this section we describe a so-called Monte Carlo technique for computing the temperature at a single interior 
mesh point of the discrete problem without having to compute the temperatures at the remaining interior mesh 
points. First we define a discrete random walk along the net. By this we mean a directed path along the net lines 
(Figure 10.11.6) that joins a succession of mesh points such that the direction of departure from each mesh point is 
chosen at random. Each of the four possible directions of departure from each mesh point along the path is to be 
equally probable. 


Figure 10.11.6 


By the use of random walks, we can compute the temperature at a specified interior mesh point on the basis of the 
following property. 


THEOREM 10.11.3 Random Walk Property

Let $W_1, W_2, \ldots, W_n$ be a succession of random walks, all of which begin at a specified interior mesh point. Let $t_1^*, t_2^*, \ldots, t_n^*$ be the temperatures at the boundary mesh points first encountered along each of these random walks. Then the average value $(t_1^* + t_2^* + \cdots + t_n^*)/n$ of these boundary temperatures approaches the temperature at the specified interior mesh point as the number of random walks $n$ increases without
bound.


This property is a consequence of the discrete mean-value property that the mesh point temperatures satisfy. The 
proof of the random walk property involves elementary concepts from probability theory, and we will not give it 
here. 


In Table 2 we display the results of a large number of computer-generated random walks for the evaluation of the temperature $t_5$ of the nine-point mesh of case (b) in Figure 10.11.6. The first column lists the number $n$ of the random walk. The second column lists the temperature $t_n^*$ of the boundary point first encountered along the corresponding random walk. The last column contains the cumulative average of the boundary temperatures encountered along the $n$ random walks. Thus, after 1000 random walks we have the approximation $t_5 \approx .7550$. This compares with the exact value $t_5 = .7491$ that we had previously evaluated. As can be seen, the convergence
to the exact value is not too rapid.
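A simulation along these lines can be sketched as follows. This is a hedged illustration only: the walk is driven directly by the four terms on the right side of each of Equations 1 (an interior neighbour continues the walk, a boundary temperature ends it), and the names `STEPS`, `random_walk`, and `estimate`, as well as the seed and the number of walks, are assumptions of this sketch. The walks it generates are not those recorded in Table 2.

```python
import random

# For each interior point: (interior neighbours, boundary temperatures).
# Together these are the four equally likely directions of departure.
STEPS = {
    1: ([2], [2.0, 0.0, 0.0]),
    2: ([1, 3, 4], [2.0]),
    3: ([2, 5], [0.0, 0.0]),
    4: ([2, 5, 7], [2.0]),
    5: ([3, 4, 6, 8], []),
    6: ([5, 9], [0.0, 0.0]),
    7: ([4, 8], [1.0, 2.0]),
    8: ([5, 7, 9], [1.0]),
    9: ([6, 8], [1.0, 0.0]),
}

def random_walk(start, rng):
    """Walk from `start` until a boundary point is hit; return its temperature."""
    point = start
    while True:
        interior, boundary = STEPS[point]
        choice = rng.randrange(4)          # four equally probable directions
        if choice < len(interior):
            point = interior[choice]
        else:
            return boundary[choice - len(interior)]

def estimate(start, n_walks, seed=0):
    """Average boundary temperature over n_walks walks (Theorem 10.11.3)."""
    rng = random.Random(seed)
    return sum(random_walk(start, rng) for _ in range(n_walks)) / n_walks

for n in (10, 100, 1000, 20000):
    print(n, round(estimate(5, n), 4))
```

Since the statistical error of such an average shrinks only like $1/\sqrt{n}$, gaining one more decimal place costs roughly a hundred times as many walks, which is consistent with the slow convergence noted above.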


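Table 2

n      t_n*    (t_1* + ... + t_n*)/n
1      1       1.0000
2      2       1.5000
3      1       1.3333
4      0       1.0000
5      2       1.2000
6      0       1.0000
7      2       1.1429
8      0       1.0000
9      2       1.1111
10     0       1.0000

Cumulative averages for larger n, in order of increasing n: 0.9500, 0.8000, 0.8250, 0.8400, 0.8300, 0.8000, 0.8050, 0.8240 (n = 250), 0.7860 (n = 500), 0.7550 (n = 1000).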


Exercise Set 10.11 


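1. A plate in the form of a circular disk has boundary temperatures of 0° on the left half of its circumference and 1° on the right half of its circumference. A net with four interior mesh points is overlaid on the disk (see Figure Ex-1).

(a) Using the discrete mean-value property, write the 4 × 4 linear system $\mathbf{t} = M\mathbf{t} + \mathbf{b}$ that determines the approximate temperatures at the four interior mesh points.

(b) Solve the linear system in part (a).

(c) Use the Jacobi iteration scheme with $\mathbf{t}^{(0)} = \mathbf{0}$ to generate the iterates $\mathbf{t}^{(1)}, \mathbf{t}^{(2)}, \mathbf{t}^{(3)}, \mathbf{t}^{(4)}$, and $\mathbf{t}^{(5)}$ for the linear system in part (a). What is the "error vector" $\mathbf{t}^{(5)} - \mathbf{t}$, where t is the solution found in part (b)?

(d) By certain advanced methods, it can be determined that the exact temperatures to four decimal places at the four mesh points are $t_1 = t_3 = .2871$ and $t_2 = t_4 = .7129$. What are the percentage errors in the values found in part (b)?

Figure Ex-1

Answer:

(a)
$$
\begin{bmatrix} t_1\\ t_2\\ t_3\\ t_4 \end{bmatrix} =
\begin{bmatrix}
0 & \tfrac14 & \tfrac14 & 0\\
\tfrac14 & 0 & 0 & \tfrac14\\
\tfrac14 & 0 & 0 & \tfrac14\\
0 & \tfrac14 & \tfrac14 & 0
\end{bmatrix}
\begin{bmatrix} t_1\\ t_2\\ t_3\\ t_4 \end{bmatrix} +
\begin{bmatrix} 0\\ \tfrac12\\ 0\\ \tfrac12 \end{bmatrix}
$$

(b)
$$\mathbf{t} = \begin{bmatrix} \tfrac14\\ \tfrac34\\ \tfrac14\\ \tfrac34 \end{bmatrix}$$

(c)
$$
\mathbf{t}^{(1)} = \begin{bmatrix} 0\\ \tfrac12\\ 0\\ \tfrac12 \end{bmatrix},\quad
\mathbf{t}^{(2)} = \begin{bmatrix} \tfrac18\\ \tfrac58\\ \tfrac18\\ \tfrac58 \end{bmatrix},\quad
\mathbf{t}^{(3)} = \begin{bmatrix} \tfrac{3}{16}\\ \tfrac{11}{16}\\ \tfrac{3}{16}\\ \tfrac{11}{16} \end{bmatrix},\quad
\mathbf{t}^{(4)} = \begin{bmatrix} \tfrac{7}{32}\\ \tfrac{23}{32}\\ \tfrac{7}{32}\\ \tfrac{23}{32} \end{bmatrix},\quad
\mathbf{t}^{(5)} = \begin{bmatrix} \tfrac{15}{64}\\ \tfrac{47}{64}\\ \tfrac{15}{64}\\ \tfrac{47}{64} \end{bmatrix},\quad
\mathbf{t}^{(5)} - \mathbf{t} = \begin{bmatrix} -\tfrac{1}{64}\\ -\tfrac{1}{64}\\ -\tfrac{1}{64}\\ -\tfrac{1}{64} \end{bmatrix}
$$

(d) for $t_1$ and $t_3$, −12.9%; for $t_2$ and $t_4$, 5.2%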


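2. Use Theorem 10.11.1 to find the exact equilibrium temperature at the center of the disk in Exercise 1.

Answer:

$\tfrac{1}{2}$

3. Calculate the first two iterates $\mathbf{t}^{(1)}$ and $\mathbf{t}^{(2)}$ for case (b) of Figure 10.11.3 with nine interior mesh points [Equation 2] when the initial iterate is chosen as

$$\mathbf{t}^{(0)} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}^T$$

Answer:

$$\mathbf{t}^{(1)} = \begin{bmatrix} \tfrac34 & \tfrac54 & \tfrac24 & \tfrac54 & \tfrac44 & \tfrac24 & \tfrac54 & \tfrac44 & \tfrac34 \end{bmatrix}^T$$

$$\mathbf{t}^{(2)} = \begin{bmatrix} \tfrac{13}{16} & \tfrac{18}{16} & \tfrac{9}{16} & \tfrac{22}{16} & \tfrac{13}{16} & \tfrac{7}{16} & \tfrac{21}{16} & \tfrac{16}{16} & \tfrac{10}{16} \end{bmatrix}^T$$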


4. The random walk illustrated in Figure Ex-4a can be described by six arrows that specify the directions of departure from the successive mesh points along the path. Figure Ex-4b is an array of 100 computer-generated, randomly oriented arrows arranged in a 10 × 10 array. Use these arrows to determine random walks to approximate the temperature $t_5$, as in Table 2. Proceed as follows:

1. Take the last two digits of your telephone number. Use the last digit to specify a row and the other to specify a column.

2. Go to the arrow in the array with that row and column number.

3. Using this arrow as a starting point, move through the array of arrows as you would read a book (left to right and top to bottom). Beginning at the point labeled $t_5$ in Figure Ex-4a and using this sequence of arrows to specify a sequence of directions, move from mesh point to mesh point until you reach a boundary mesh point. This completes your first random walk. Record the temperature at the boundary mesh point. (If you reach the end of the arrow array, continue with the arrow in the upper left corner.)

4. Return to the interior mesh point labeled $t_5$ and begin where you left off in the arrow array; generate your next random walk. Repeat this process until you have completed 10 random walks and have recorded 10 boundary temperatures.

5. Calculate the average of the 10 boundary temperatures recorded. (The exact value is $t_5 = .7491$.)

Figure Ex-4


Section 10.11 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic 
proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be 
able to use your technology utility to solve many of the problems in the regular exercise sets. 


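T1. Suppose that we have the square region described by

$$R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$$

and suppose that the equilibrium temperature distribution $u(x, y)$ along the boundary is given by $u(x, 0) = T_B$, $u(x, 1) = T_T$, $u(0, y) = T_L$, and $u(1, y) = T_R$. Suppose next that this region is partitioned into an $(n + 1) \times (n + 1)$ mesh using

$$x_i = \frac{i}{n} \quad\text{and}\quad y_j = \frac{j}{n}$$

for $i = 0, 1, 2, \ldots, n$ and $j = 0, 1, 2, \ldots, n$. If the temperatures of the interior mesh points are labeled by

$$u_{i,j} = u(x_i, y_j) = u(i/n, j/n)$$

then show that

$$u_{i,j} = \tfrac{1}{4}\left(u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1}\right)$$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$. To handle the boundary points, define

$$u_{0,j} = T_L,\quad u_{n,j} = T_R,\quad u_{i,0} = T_B,\quad\text{and}\quad u_{i,n} = T_T$$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$. Next let

$$F_{n+1} = \begin{bmatrix} \mathbf{0} & I_n\\ 1 & \mathbf{0} \end{bmatrix}$$

be the $(n + 1) \times (n + 1)$ matrix with the $n \times n$ identity matrix in the upper right-hand corner, a one in the lower left-hand corner, and zeros everywhere else. For example,

$$
F_2 = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix},\quad
F_3 = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{bmatrix},\quad
F_4 = \begin{bmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0 \end{bmatrix},\quad
F_5 = \begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}
$$

and so on. By defining the $(n + 1) \times (n + 1)$ matrix

$$M_{n+1} = F_{n+1} + F_{n+1}^T = \begin{bmatrix} \mathbf{0} & I_n\\ 1 & \mathbf{0} \end{bmatrix} + \begin{bmatrix} \mathbf{0} & 1\\ I_n & \mathbf{0} \end{bmatrix}$$

show that if $U_{n+1}$ is the $(n + 1) \times (n + 1)$ matrix with entries $u_{i,j}$, then the set of equations

$$u_{i,j} = \tfrac{1}{4}\left(u_{i-1,j} + u_{i+1,j} + u_{i,j-1} + u_{i,j+1}\right)$$

for $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$ can be written as the matrix equation

$$U_{n+1} = \tfrac{1}{4}\left(M_{n+1}U_{n+1} + U_{n+1}M_{n+1}\right)$$

where we consider only those elements of $U_{n+1}$ with $i = 1, 2, 3, \ldots, n - 1$ and $j = 1, 2, 3, \ldots, n - 1$.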


T2. The results of the preceding exercise and the discussion in the text suggest the following algorithm for solving for the equilibrium temperature in the square region

$$R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$$

given the boundary conditions

$$u(x, 0) = T_B,\quad u(x, 1) = T_T,\quad u(0, y) = T_L,\quad u(1, y) = T_R$$

1. Choose a value for $n$, and then choose an initial guess, say

$$
U_{n+1}^{(0)} = \begin{bmatrix}
0 & T_L & \cdots & T_L & 0\\
T_B & 0 & \cdots & 0 & T_T\\
\vdots & \vdots & & \vdots & \vdots\\
T_B & 0 & \cdots & 0 & T_T\\
0 & T_R & \cdots & T_R & 0
\end{bmatrix}
$$

2. For each value of $k = 0, 1, 2, 3, \ldots$, compute $U_{n+1}^{(k+1)}$ using

$$U_{n+1}^{(k+1)} = \tfrac{1}{4}\left(M_{n+1}U_{n+1}^{(k)} + U_{n+1}^{(k)}M_{n+1}\right)$$

where $M_{n+1}$ is as defined in Exercise T1. Then adjust $U_{n+1}^{(k+1)}$ by replacing all edge entries by the initial edge entries in $U_{n+1}^{(0)}$. [Note: The edge entries of a matrix are the entries in the first and last columns and first and last rows.]

3. Continue this process until $U_{n+1}^{(k+1)} - U_{n+1}^{(k)}$ is approximately the zero matrix. This suggests that

$$U_{n+1} = \lim_{k \to \infty} U_{n+1}^{(k)}$$

Use a computer and this algorithm to solve for $u(x, y)$ given that

$$u(x, 0) = 0,\quad u(x, 1) = 0,\quad u(0, y) = 0,\quad u(1, y) = 2$$

Choose $n = 6$ and compute up to $U_7^{(30)}$. The exact solution can be expressed as

$$u(x, y) = \frac{8}{\pi}\sum_{m=1}^{\infty} \frac{\sinh[(2m - 1)\pi x]\,\sin[(2m - 1)\pi y]}{(2m - 1)\sinh[(2m - 1)\pi]}$$

Use a computer to compute $u(i/6, j/6)$ for $i, j = 0, 1, 2, 3, 4, 5, 6$, and then compare your results to the values of $u(i/6, j/6)$ in $U_7^{(30)}$.


T3. Using the exact solution $u(x, y)$ for the temperature distribution described in Exercise T2, use a graphing program to do the following:

(a) Plot the surface $z = u(x, y)$ in three-dimensional xyz-space in which $z$ is the temperature at the point $(x, y)$ in the square region.

(b) Plot several isotherms of the temperature distribution (curves in the xy-plane over which the temperature is a constant).

(c) Plot several curves of the temperature as a function of x with y held constant.

(d) Plot several curves of the temperature as a function of y with x held constant.


10.12 Computed Tomography 


In this section we will see how constructing a cross-sectional view of a human body by analyzing X-ray scans leads to an inconsistent linear 
system. We present an iteration technique that provides an "approximate solution" of the linear system. 


Prerequisites 


Linear Systems 
Natural Logarithms 
Euclidean Space $R^n$


The basic problem of computed tomography is to construct an image of a cross section of the human body using data collected from many 
individual beams of X rays that are passed through the cross section. These data are processed by a computer, and the computed cross section is 
displayed on a video monitor. Figure 10.12.1 is a diagram of General Electric's CT system showing a patient prepared to have a cross section of 
his head scanned by X-ray beams. 


Figure 10.12.1 


Such a system is also known as a CAT scanner, for Computer-Aided Tomography scanner. Figure 10.12.2 shows a typical cross section of a 
human head produced by the system. 


Figure 10.12.2 


The first commercial system of computed tomography for medical use was developed in 1971 by G. N. Hounsfield of EMI, Ltd., in England. In 1979, Hounsfield and A. M. Cormack were awarded the Nobel Prize for their pioneering work in the field. As we will see in this section, the construction of a cross section, or tomograph, requires the solution of a large linear system of equations. Certain algorithms, called algebraic reconstruction techniques (ARTs), can be used to solve these linear systems, whose solutions yield the cross sections in digital form.


Scanning Modes 


Unlike conventional X-ray pictures that are formed by X rays that are projected perpendicular to the plane of the picture, tomographs are constructed from thousands of individual, hairline-thin X-ray beams that lie in the plane of the cross section. After they pass through the cross section, the intensities of the X-ray beams are measured by an X-ray detector, and these measurements are relayed to a computer where they are processed. Figures 10.12.3 and 10.12.4 illustrate two possible modes of scanning the cross section: the parallel mode and the fan-beam mode. In the parallel mode a single X-ray source and X-ray detector pair are translated across the field of view containing the cross section, and many measurements of the parallel beams are recorded. Then the source and detector pair are rotated through a small angle, and another set of measurements is taken. This is repeated until the desired number of beam measurements is completed. For example, in the original 1971 machine, 160 parallel measurements were taken through 180 angles spaced 1° apart: a total of 160 × 180 = 28,800 beam measurements. Each such scan took approximately 5½ minutes.


Figure 10.12.3 


Figure 10.12.4 


In the fan-beam mode of scanning, a single X-ray tube generates a fan of collimated beams whose intensities are measured simultaneously by 
an array of detectors on the other side of the field of view. The X-ray tube and detector array are rotated through many angles, and a set of 
measurements is taken at each angle until the scan is completed. In the General Electric CT system, which uses the fan-beam mode, each scan 
takes 1 second. 


Derivation of Equations 


To see how the cross section is reconstructed from the many individual beam measurements, refer to Figure 10.12.5. Here the field of view in 
which the cross section is situated has been divided into many square pixels (picture elements) numbered 1 through N as indicated. It is our 
desire to determine the X-ray density of each pixel. In the EMI system, 6400 pixels were used, arranged in a square 80 x 80 array. The G.E. CT 
system uses 262,144 pixels in a 512 x 512 array, each pixel being about 1 mm on a side. After the densities of the pixels are determined by the 
method we will describe, they are reproduced on a video monitor, with each pixel shaded a level of gray proportional to its X-ray density. 
Because different tissues within the human body have different X-ray densities, the video display clearly distinguishes the various tissues and 
organs within the cross section. 


Figure 10.12.5 


Figure 10.12.6 shows a single pixel with an X-ray beam of roughly the same width as the pixel passing squarely through it. The photons constituting the X-ray beam are absorbed by the tissue within the pixel at a rate proportional to the X-ray density of the tissue. Quantitatively, the X-ray density of the jth pixel is denoted by $x_j$ and is defined by

$$x_j = \ln\left(\frac{\text{number of photons entering the } j\text{th pixel}}{\text{number of photons leaving the } j\text{th pixel}}\right)$$

where "ln" denotes the natural logarithmic function. Using the logarithm property $\ln(a/b) = -\ln(b/a)$, we also have

$$x_j = -\ln\left(\text{fraction of photons that pass through the } j\text{th pixel without being absorbed}\right)$$

Figure 10.12.6 


If the X-ray beam passes through an entire row of pixels (Figure 10.12.7), then the number of photons leaving one pixel is equal to the number of photons entering the next pixel in the row. If the pixels are numbered $1, 2, \ldots, n$, then the additive property of the logarithmic function gives

$$
\begin{aligned}
x_1 + x_2 + \cdots + x_n &= \ln\left(\frac{\text{number of photons entering the first pixel}}{\text{number of photons leaving the } n\text{th pixel}}\right)\\
&= -\ln\left(\text{fraction of photons that pass through the row of } n \text{ pixels without being absorbed}\right)
\end{aligned}
\tag{1}
$$


Thus, to determine the total X-ray density of a row of pixels, we simply sum the individual pixel densities. 
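The additive property in Equation 1 is easy to confirm numerically. In the following sketch the photon counts are hypothetical values invented purely for illustration; `counts[0]` is the number of photons entering the first pixel and `counts[j]` the number leaving the jth pixel.

```python
import math

# Hypothetical photon counts along a row of four pixels (illustrative only).
counts = [10000, 8200, 5100, 4600, 2300]

# X-ray density of each pixel: ln(photons entering / photons leaving).
densities = [math.log(counts[j] / counts[j + 1]) for j in range(len(counts) - 1)]

# Total density of the row, computed from the first and last counts only.
total = math.log(counts[0] / counts[-1])
fraction_transmitted = counts[-1] / counts[0]

print(sum(densities), total, -math.log(fraction_transmitted))
```

The three printed numbers agree up to rounding because the intermediate counts cancel in the product of ratios, which is exactly why a detector measuring only the photons that emerge from the row is enough to determine the sum of the pixel densities.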


Figure 10.12.7 


Next, consider the X-ray beam in Figure 10.12.5. By the beam density of the ith beam of a scan, denoted by $b_i$, we mean

$$
\begin{aligned}
b_i &= \ln\left(\frac{\text{number of photons of the } i\text{th beam entering the detector without the cross section in the field of view}}{\text{number of photons of the } i\text{th beam entering the detector with the cross section in the field of view}}\right)\\
&= -\ln\left(\text{fraction of photons of the } i\text{th beam that pass through the cross section without being absorbed}\right)
\end{aligned}
\tag{2}
$$

The numerator in the first expression for $b_i$ is obtained by performing a calibration scan without the cross section in the field of view. The resulting detector measurements are stored within the computer's memory. Then a clinical scan is performed with the cross section in the field of view, the $b_i$'s of all the beams constituting the scan are computed, and the values are stored for further processing.


For each beam that passes squarely through a row of pixels, we must have

$$
\left(\begin{array}{c}\text{fraction of photons of the beam}\\ \text{that pass through the row of pixels}\\ \text{without being absorbed}\end{array}\right)
=
\left(\begin{array}{c}\text{fraction of photons of the beam}\\ \text{that pass through the cross section}\\ \text{without being absorbed}\end{array}\right)
$$

Thus, if the ith beam passes squarely through a row of $n$ pixels, then it follows from Equations 1 and 2 that

$$x_1 + x_2 + \cdots + x_n = b_i$$

In this equation, $b_i$ is known from the clinical and calibration measurements, and $x_1, x_2, \ldots, x_n$ are unknown pixel densities that must be determined.

More generally, if the ith beam passes squarely through a row (or column) of pixels with numbers $j_1, j_2, \ldots, j_k$, then we have

$$x_{j_1} + x_{j_2} + \cdots + x_{j_k} = b_i$$

If we set

$$
a_{ij} = \begin{cases} 1, & \text{if } j = j_1, j_2, \ldots, j_k\\ 0, & \text{otherwise} \end{cases}
$$

then we can write this equation as

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{iN}x_N = b_i \tag{3}$$

We will refer to Equation 3 as the ith beam equation.


Referring to Figure 10.12.5, however, we see that the beams of a scan do not necessarily pass through a row or column of pixels squarely. Instead, a typical beam passes diagonally through each pixel in its path. There are many ways to take this into account. In Figure 10.12.8 we outline three methods of defining the quantities $a_{ij}$ that appear in Equation 3, each of which reduces to our previous definition when the beam passes squarely through a row or column of pixels. Reading down the figure, each method is more exact than its predecessor, but with successively more computational difficulty.


Center-of-Pixel Method

$$
a_{ij} = \begin{cases} 1, & \text{if the } i\text{th beam passes through the center of the } j\text{th pixel}\\ 0, & \text{otherwise} \end{cases}
$$

Center Line Method

$$
a_{ij} = \frac{\text{length of the center line of the } i\text{th beam that lies in the } j\text{th pixel}}{\text{width of the } j\text{th pixel}}
$$

Area Method

$$
a_{ij} = \frac{\text{area of the } i\text{th beam that lies in the } j\text{th pixel}}{\text{area of the } i\text{th beam that would lie in the } j\text{th pixel if the } i\text{th beam were to cross the pixel squarely}}
$$

Figure 10.12.8 


Using any one of the three methods to define the $a_{ij}$'s in the ith beam equation, we can write the set of M beam equations in a complete scan as

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1N}x_N &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2N}x_N &= b_2\\
&\ \ \vdots\\
a_{M1}x_1 + a_{M2}x_2 + \cdots + a_{MN}x_N &= b_M
\end{aligned}
\tag{4}
$$

In this way we have a linear system of M equations (the M beam equations) in N unknowns (the N pixel densities).

Depending on the number of beams and pixels used, we can have M > N, M = N, or M < N. We will consider only the case M > N, the so-called overdetermined case, in which there are more beams in the scan than pixels in the field of view. Because of inherent modeling and experimental errors in the problem, we should not expect our linear system to have an exact mathematical solution for the pixel densities. In the next section we attempt to find an "approximate" solution to this linear system.


Algebraic Reconstruction Techniques 


There have been many mathematical algorithms devised to treat the overdetermined linear system 4. The one we will describe belongs to the 
class of so-called Algebraic Reconstruction Techniques (ARTs). This method, which can be traced to an iterative technique originally 
introduced by S. Kaczmarz in 1937, was the one used in the first commercial machine. To introduce this technique, consider the following 
system of three equations in two unknowns: 


$$
\begin{aligned}
L_1:&\quad x_1 + x_2 = 2\\
L_2:&\quad x_1 - 2x_2 = -2\\
L_3:&\quad 3x_1 - x_2 = 3
\end{aligned}
\tag{5}
$$

The lines $L_1$, $L_2$, $L_3$ determined by these three equations are plotted in the $x_1x_2$-plane. As shown in Figure 10.12.9a, the three lines do not have a common intersection, and so the three equations do not have an exact solution. However, the points $(x_1, x_2)$ on the shaded triangle formed by the three lines are all situated "near" these three lines and can be thought of as constituting "approximate" solutions to our system. The following iterative procedure describes a geometric construction for generating points on the boundary of that triangular region (Figure 10.12.9b):


Algorithm 1

Step 0 Choose an arbitrary starting point $\mathbf{x}_0$ in the $x_1x_2$-plane.

Step 1 Project $\mathbf{x}_0$ orthogonally onto the first line $L_1$ and call the projection $\mathbf{x}_1^{(1)}$. The superscript (1) indicates that this is the first of several cycles through the steps.

Step 2 Project $\mathbf{x}_1^{(1)}$ orthogonally onto the second line $L_2$ and call the projection $\mathbf{x}_2^{(1)}$.

Step 3 Project $\mathbf{x}_2^{(1)}$ orthogonally onto the third line $L_3$ and call the projection $\mathbf{x}_3^{(1)}$.

Step 4 Take $\mathbf{x}_3^{(1)}$ as the new value of $\mathbf{x}_0$ and cycle through Steps 1 through 3 again. In the second cycle, label the projected points $\mathbf{x}_1^{(2)}, \mathbf{x}_2^{(2)}, \mathbf{x}_3^{(2)}$; in the third cycle, label the projected points $\mathbf{x}_1^{(3)}, \mathbf{x}_2^{(3)}, \mathbf{x}_3^{(3)}$; and so forth.

This algorithm generates three sequences of points

$$
\begin{aligned}
L_1:&\quad \mathbf{x}_1^{(1)}, \mathbf{x}_1^{(2)}, \mathbf{x}_1^{(3)}, \ldots\\
L_2:&\quad \mathbf{x}_2^{(1)}, \mathbf{x}_2^{(2)}, \mathbf{x}_2^{(3)}, \ldots\\
L_3:&\quad \mathbf{x}_3^{(1)}, \mathbf{x}_3^{(2)}, \mathbf{x}_3^{(3)}, \ldots
\end{aligned}
$$

that lie on the three lines $L_1$, $L_2$, and $L_3$, respectively. It can be shown that as long as the three lines are not all parallel, then the first sequence converges to a point $\mathbf{x}_1^*$ on $L_1$, the second sequence converges to a point $\mathbf{x}_2^*$ on $L_2$, and the third sequence converges to a point $\mathbf{x}_3^*$ on $L_3$ (Figure 10.12.9c). These three limit points form what is called the limit cycle of the iterative process. It can be shown that the limit cycle is independent of the starting point $\mathbf{x}_0$.


Figure 10.12.9 


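Next we discuss the specific formulas needed to effect the orthogonal projections in Algorithm 1. First, because the equation of a line in $x_1x_2$-space is

$$a_1x_1 + a_2x_2 = b$$

we can express it in vector form as

$$\mathbf{a}^T\mathbf{x} = b$$

where

$$\mathbf{a} = \begin{bmatrix} a_1\\ a_2 \end{bmatrix} \quad\text{and}\quad \mathbf{x} = \begin{bmatrix} x_1\\ x_2 \end{bmatrix}$$

The following theorem gives the necessary projection formula (Exercise 5).

THEOREM 10.12.1 Orthogonal Projection Formula

Let L be a line in $R^2$ with equation $\mathbf{a}^T\mathbf{x} = b$, and let $\mathbf{x}^*$ be any point in $R^2$ (Figure 10.12.10). Then the orthogonal projection, $\mathbf{x}_p$, of $\mathbf{x}^*$ onto L is given by

$$\mathbf{x}_p = \mathbf{x}^* + \frac{b - \mathbf{a}^T\mathbf{x}^*}{\mathbf{a}^T\mathbf{a}}\,\mathbf{a}$$

Figure 10.12.10 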


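EXAMPLE 1 Using Algorithm 1

We can use Algorithm 1 to find an approximate solution of the linear system given in Equations 5 and illustrated in Figure 10.12.9. If we write the equations of the three lines as

$$
\begin{aligned}
L_1:&\quad \mathbf{a}_1^T\mathbf{x} = b_1\\
L_2:&\quad \mathbf{a}_2^T\mathbf{x} = b_2\\
L_3:&\quad \mathbf{a}_3^T\mathbf{x} = b_3
\end{aligned}
$$

where

$$
\mathbf{x} = \begin{bmatrix} x_1\\ x_2 \end{bmatrix},\quad
\mathbf{a}_1 = \begin{bmatrix} 1\\ 1 \end{bmatrix},\quad
\mathbf{a}_2 = \begin{bmatrix} 1\\ -2 \end{bmatrix},\quad
\mathbf{a}_3 = \begin{bmatrix} 3\\ -1 \end{bmatrix},\quad
b_1 = 2,\quad b_2 = -2,\quad b_3 = 3
$$

then, using Theorem 10.12.1, we can express the iteration scheme in Algorithm 1 as

$$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)}}{\mathbf{a}_k^T\mathbf{a}_k}\,\mathbf{a}_k,\qquad k = 1, 2, 3$$

where $p = 1$ for the first cycle of iterates, $p = 2$ for the second cycle of iterates, and so forth. After each cycle of iterates (i.e., after $\mathbf{x}_3^{(p)}$ is computed), the next cycle of iterates is begun with $\mathbf{x}_0^{(p+1)}$ set equal to $\mathbf{x}_3^{(p)}$.

Table 1 gives the numerical results of six cycles of iterations starting with the initial point $\mathbf{x}_0^{(1)} = (1, 3)$.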


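Table 1

Iterate       x_1        x_2
x_0^(1)       1.00000    3.00000
x_1^(1)        .00000    2.00000
x_2^(1)        .40000    1.20000
x_3^(1)       1.30000     .90000
x_1^(2)       1.20000     .80000
x_2^(2)        .88000    1.44000
x_3^(2)       1.42000    1.26000
x_1^(3)       1.08000     .92000
x_2^(3)        .83200    1.41600
x_3^(3)       1.40800    1.22400
x_1^(4)       1.09200     .90800
x_2^(4)        .83680    1.41840
x_3^(4)       1.40920    1.22760
x_1^(5)       1.09080     .90920
x_2^(5)        .83632    1.41816
x_3^(5)       1.40908    1.22724
x_1^(6)       1.09092     .90908
x_2^(6)        .83637    1.41818
x_3^(6)       1.40909    1.22728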

Using certain techniques that are impractical for large linear systems, we can show the exact values of the points of the limit cycle in this example to be

$$
\begin{aligned}
\mathbf{x}_1^* &= \left(\tfrac{12}{11}, \tfrac{10}{11}\right) = (1.09090\ldots,\ .90909\ldots)\\
\mathbf{x}_2^* &= \left(\tfrac{46}{55}, \tfrac{78}{55}\right) = (.83636\ldots,\ 1.41818\ldots)\\
\mathbf{x}_3^* &= \left(\tfrac{31}{22}, \tfrac{27}{22}\right) = (1.40909\ldots,\ 1.22727\ldots)
\end{aligned}
$$

It can be seen that the sixth cycle of iterates provides an excellent approximation to the limit cycle. Any one of the three iterates $\mathbf{x}_1^{(6)}, \mathbf{x}_2^{(6)}, \mathbf{x}_3^{(6)}$ can be used as an approximate solution of the linear system. (The large discrepancies in the values of $\mathbf{x}_1^{(6)}$, $\mathbf{x}_2^{(6)}$, and $\mathbf{x}_3^{(6)}$ are due to the artificial nature of this illustrative example. In practical problems, these discrepancies would be much smaller.)
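The computation in Example 1 can be reproduced with a few lines of code. The sketch below is illustrative only; the helper names `project` and `run_cycles` are choices made for this example, and the three lines are those of Equations 5.

```python
def project(x, a, b):
    """Orthogonal projection of x onto the line a . x = b (Theorem 10.12.1)."""
    scale = (b - sum(ai * xi for ai, xi in zip(a, x))) / sum(ai * ai for ai in a)
    return [xi + scale * ai for xi, ai in zip(x, a)]

# The lines L1, L2, L3 of Equations 5 as (a_k, b_k) pairs.
LINES = [([1.0, 1.0], 2.0), ([1.0, -2.0], -2.0), ([3.0, -1.0], 3.0)]

def run_cycles(x0, cycles):
    """Run Algorithm 1 for `cycles` cycles; return the last cycle's three iterates."""
    x = list(x0)
    iterates = []
    for _ in range(cycles):
        iterates = []
        for a, b in LINES:
            x = project(x, a, b)
            iterates.append(x)
        # x now holds the projection onto L3, which starts the next cycle
    return iterates

for p in range(1, 7):
    x1, x2, x3 = run_cycles([1.0, 3.0], p)
    print(p, [round(v, 5) for v in x1], [round(v, 5) for v in x2],
          [round(v, 5) for v in x3])
```

Starting the same loop from a different point, for example `run_cycles([-5.0, 7.0], 50)`, gives a quick empirical check that the limit cycle does not depend on the starting point.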


To generalize Algorithm 1 so that it applies to an overdetermined system of M equations in N unknowns,

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1N}x_N &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2N}x_N &= b_2\\
&\ \ \vdots\\
a_{M1}x_1 + a_{M2}x_2 + \cdots + a_{MN}x_N &= b_M
\end{aligned}
\tag{6}
$$

we introduce column vectors $\mathbf{x}$ and $\mathbf{a}_i$ as follows:

$$
\mathbf{x} = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_N \end{bmatrix},\qquad
\mathbf{a}_i = \begin{bmatrix} a_{i1}\\ a_{i2}\\ \vdots\\ a_{iN} \end{bmatrix},\qquad i = 1, 2, \ldots, M
$$

With these vectors, the M equations constituting our linear system 6 can be written in vector form as

$$\mathbf{a}_i^T\mathbf{x} = b_i,\qquad i = 1, 2, \ldots, M$$

Each of these M equations defines what is called a hyperplane in the N-dimensional Euclidean space $R^N$. In general these M hyperplanes have no common intersection, and so we seek instead some point in $R^N$ that is reasonably "close" to all of them. Such a point will constitute an approximate solution of the linear system, and its N entries will determine approximate pixel densities with which to form the desired cross section.

As in the two-dimensional case, we will introduce an iterative process that generates cycles of successive orthogonal projections onto the M hyperplanes beginning with some arbitrary initial point in $R^N$. Our notation for these successive iterates is

$$\mathbf{x}_k^{(p)} = \text{the iterate lying on the } k\text{th hyperplane generated during the } p\text{th cycle of iterations}$$

The algorithm is as follows:

Algorithm 2

Step 0 Choose any point in $R^N$ and label it $\mathbf{x}_0^{(1)}$.

Step 1 For the first cycle of iterates, set $p = 1$.

Step 2 For $k = 1, 2, \ldots, M$, compute

$$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)}}{\mathbf{a}_k^T\mathbf{a}_k}\,\mathbf{a}_k$$

Step 3 Set $\mathbf{x}_0^{(p+1)} = \mathbf{x}_M^{(p)}$.

Step 4 Increase the cycle number $p$ by 1 and return to Step 2.

In Step 2 the iterate $\mathbf{x}_k^{(p)}$ is called the orthogonal projection of $\mathbf{x}_{k-1}^{(p)}$ onto the hyperplane $\mathbf{a}_k^T\mathbf{x} = b_k$. Consequently, as in the two-dimensional case, this algorithm determines a sequence of orthogonal projections from one hyperplane onto the next in which we cycle back to the first hyperplane after each projection onto the last hyperplane.

It can be shown that if the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_M$ span $R^N$, then the iterates $\mathbf{x}_M^{(1)}, \mathbf{x}_M^{(2)}, \mathbf{x}_M^{(3)}, \ldots$ lying on the Mth hyperplane will converge to a point $\mathbf{x}_M^*$ on that hyperplane which does not depend on the choice of the initial point $\mathbf{x}_0^{(1)}$. In computed tomography, one of the iterates $\mathbf{x}_M^{(p)}$ for $p$ sufficiently large is taken as an approximate solution of the linear system for the pixel densities.

Note that for the center-of-pixel method, the scalar quantity $\mathbf{a}_k^T\mathbf{a}_k$ appearing in the equation in Step 2 of the algorithm is simply the number of pixels in which the kth beam passes through the center. Similarly, note that the scalar quantity

$$b_k - \mathbf{a}_k^T\mathbf{x}_{k-1}^{(p)}$$

in that same equation can be interpreted as the excess kth beam density that results if the pixel densities are set equal to the entries of $\mathbf{x}_{k-1}^{(p)}$. This provides the following interpretation of our ART iteration scheme for the center-of-pixel method: Generate the pixel densities of each iterate by distributing the excess beam density of successive beams in the scan evenly among those pixels in which the beam passes through the center. When the last beam in the scan has been reached, return to the first beam and continue.


EXAMPLE 2 Using Algorithm 2. < 


We can use Algorithm 2 to find the unknown pixel densities of the 9 pixels arranged in the 3 x 3 array illustrated in Figure 
10.12.11. These 9 pixels are scanned using the parallel mode with 12 beams whose measured beam densities are indicated in the 
figure. We choose the center-of-pixel method to set up the 12 beam equations. (In Exercises 7 and 8, you are asked to set up the 
beam equations using the center line and area methods.) As you can verify, the beam equations are 


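$$\begin{aligned}
x_7 + x_8 + x_9 &= 13.00 & \qquad x_3 + x_6 + x_9 &= 18.00 \\
x_4 + x_5 + x_6 &= 15.00 & \qquad x_2 + x_5 + x_8 &= 12.00 \\
x_1 + x_2 + x_3 &= 8.00 & \qquad x_1 + x_4 + x_7 &= 6.00 \\
x_6 + x_8 + x_9 &= 14.79 & \qquad x_2 + x_3 + x_6 &= 10.51 \\
x_3 + x_5 + x_7 &= 14.31 & \qquad x_1 + x_5 + x_9 &= 16.13 \\
x_1 + x_2 + x_4 &= 3.81 & \qquad x_4 + x_7 + x_8 &= 7.04
\end{aligned}$$

Table 2 illustrates the results of the iteration scheme starting with an initial $\mathbf{x}_0^{(1)} = \mathbf{0}$. The table gives the values of each of the first cycle of iterates, $\mathbf{x}_1^{(1)}$ through $\mathbf{x}_{12}^{(1)}$, but thereafter gives the iterates $\mathbf{x}_{12}^{(p)}$ only for various values of $p$. The iterates $\mathbf{x}_{12}^{(p)}$ start repeating to two decimal places for $p \ge 45$, and so we take the entries of $\mathbf{x}_{12}^{(45)}$ as approximate values of the 9 pixel densities.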
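
For readers who want to reproduce the iteration of this example on a computer, the following Python sketch applies the center-of-pixel interpretation directly: each beam's excess density is shared evenly among the pixels whose centers it crosses. The beam list is read off the twelve beam equations above (left column first, then right column); the function name and the choice of 45 cycles are assumptions made for illustration.

```python
# Pixels crossed through their centers by each beam (1-based pixel numbers),
# paired with the measured beam density.
beams = [
    ((7, 8, 9), 13.00), ((4, 5, 6), 15.00), ((1, 2, 3), 8.00),
    ((6, 8, 9), 14.79), ((3, 5, 7), 14.31), ((1, 2, 4), 3.81),
    ((3, 6, 9), 18.00), ((2, 5, 8), 12.00), ((1, 4, 7), 6.00),
    ((2, 3, 6), 10.51), ((1, 5, 9), 16.13), ((4, 7, 8), 7.04),
]


def art_center_of_pixel(beams, n_pixels, cycles):
    x = [0.0] * n_pixels                                  # initial point x0 = 0
    for _ in range(cycles):
        for pixels, b_k in beams:
            excess = b_k - sum(x[j - 1] for j in pixels)  # excess beam density
            share = excess / len(pixels)                  # a_k^T a_k = number of pixels hit
            for j in pixels:
                x[j - 1] += share                         # distribute the excess evenly
    return x


densities = art_center_of_pixel(beams, 9, 45)
print([round(v, 2) for v in densities])
```

Because the last operation in every cycle is a projection onto the twelfth hyperplane, the returned densities always satisfy the twelfth beam equation exactly, even though the full system need not be consistent.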


Figure 10.12.11


We close this section by noting that the field of computed tomography is presently a very active research area. In fact, the ART scheme discussed here has been replaced in commercial systems by more sophisticated techniques that are faster and provide a more accurate view of the cross section. However, all the new techniques address the same basic mathematical problem: finding a good approximate solution of a large overdetermined inconsistent linear system of equations.


Exercise Set 10.12 


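1. (a) Setting $\mathbf{x}_k^{(p)} = (x_{k1}^{(p)}, x_{k2}^{(p)})$, show that the three projection equations

$$\mathbf{x}_k^{(p)} = \mathbf{x}_{k-1}^{(p)} + \frac{b_k - \mathbf{a}_k^T \mathbf{x}_{k-1}^{(p)}}{\mathbf{a}_k^T \mathbf{a}_k}\,\mathbf{a}_k, \qquad k = 1, 2, 3$$

for the three lines in Equation 5 can be written as

$$\begin{aligned}
k = 1:&\quad x_{11}^{(p)} = \tfrac{1}{2}\left[2 + x_{01}^{(p)} - x_{02}^{(p)}\right], &\quad x_{12}^{(p)} &= \tfrac{1}{2}\left[2 - x_{01}^{(p)} + x_{02}^{(p)}\right] \\
k = 2:&\quad x_{21}^{(p)} = \tfrac{1}{5}\left[-2 + 4x_{11}^{(p)} + 2x_{12}^{(p)}\right], &\quad x_{22}^{(p)} &= \tfrac{1}{5}\left[4 + 2x_{11}^{(p)} + x_{12}^{(p)}\right] \\
k = 3:&\quad x_{31}^{(p)} = \tfrac{1}{10}\left[9 + x_{21}^{(p)} + 3x_{22}^{(p)}\right], &\quad x_{32}^{(p)} &= \tfrac{1}{10}\left[-3 + 3x_{21}^{(p)} + 9x_{22}^{(p)}\right]
\end{aligned}$$

where $(x_{01}^{(p+1)}, x_{02}^{(p+1)}) = (x_{31}^{(p)}, x_{32}^{(p)})$ for $p = 1, 2, \ldots$.
(b) Show that the three pairs of equations in part (a) can be combined to produce

$$\begin{aligned}
x_{31}^{(p)} &= \tfrac{1}{20}\left[28 + x_{31}^{(p-1)} - x_{32}^{(p-1)}\right] \\
x_{32}^{(p)} &= \tfrac{1}{20}\left[24 + 3x_{31}^{(p-1)} - 3x_{32}^{(p-1)}\right]
\end{aligned} \qquad p = 1, 2, \ldots$$

where $(x_{31}^{(0)}, x_{32}^{(0)}) = (x_{01}^{(1)}, x_{02}^{(1)}) = \mathbf{x}_0^{(1)}$. [Note: Using this pair of equations, we can perform one complete cycle of three orthogonal projections in a single step.]
(c) Because $\mathbf{x}_3^{(p)}$ tends to the limit point $\mathbf{x}_3^*$ as $p \to \infty$, the equations in part (b) become

$$\begin{aligned}
x_{31}^* &= \tfrac{1}{20}\left[28 + x_{31}^* - x_{32}^*\right] \\
x_{32}^* &= \tfrac{1}{20}\left[24 + 3x_{31}^* - 3x_{32}^*\right]
\end{aligned}$$

as $p \to \infty$. Solve this linear system for $\mathbf{x}_3^* = (x_{31}^*, x_{32}^*)$. [Note: The simplifications of the ART formulas described in this exercise are impractical for the large linear systems that arise in realistic computed tomography problems.]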


Answer: 


(c) $\mathbf{x}_3^* = \left(\tfrac{31}{22}, \tfrac{27}{22}\right)$


2. Use the result of Exercise 1(b) to find $\mathbf{x}_3^{(1)}, \mathbf{x}_3^{(2)}, \ldots, \mathbf{x}_3^{(6)}$ to five decimal places in Example 1 using the following initial points:
(a) $\mathbf{x}_0 = (0, 0)$
(b) $\mathbf{x}_0 = (1, 1)$
(c) $\mathbf{x}_0 = (148, -15)$


Answer: 


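(a) $\mathbf{x}_3^{(1)} = (1.40000, 1.20000)$
$\mathbf{x}_3^{(2)} = (1.41000, 1.23000)$
$\mathbf{x}_3^{(3)} = (1.40900, 1.22700)$
$\mathbf{x}_3^{(4)} = (1.40910, 1.22730)$
$\mathbf{x}_3^{(5)} = (1.40909, 1.22727)$
$\mathbf{x}_3^{(6)} = (1.40909, 1.22727)$
(b) Same as part (a)
(c) $\mathbf{x}_3^{(1)} = (9.55000, 25.65000)$
$\mathbf{x}_3^{(2)} = (.59500, -1.21500)$
$\mathbf{x}_3^{(3)} = (1.49050, 1.47150)$
$\mathbf{x}_3^{(4)} = (1.40095, 1.20285)$
$\mathbf{x}_3^{(5)} = (1.40991, 1.22972)$
$\mathbf{x}_3^{(6)} = (1.40901, 1.22703)$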


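3. (a) Show directly that the points of the limit cycle in Example 1,

$$\mathbf{x}_1^* = \left(\tfrac{12}{11}, \tfrac{10}{11}\right), \qquad \mathbf{x}_2^* = \left(\tfrac{46}{55}, \tfrac{78}{55}\right), \qquad \mathbf{x}_3^* = \left(\tfrac{31}{22}, \tfrac{27}{22}\right)$$

form a triangle whose vertices lie on the lines $L_1$, $L_2$, and $L_3$ and whose sides are perpendicular to these lines (Figure 10.12.9c).

(b) Using the equations derived in Exercise 1(a), show that if $\mathbf{x}_0^{(1)} = \mathbf{x}_3^* = \left(\tfrac{31}{22}, \tfrac{27}{22}\right)$, then

$$\mathbf{x}_1^{(1)} = \mathbf{x}_1^* = \left(\tfrac{12}{11}, \tfrac{10}{11}\right), \qquad \mathbf{x}_2^{(1)} = \mathbf{x}_2^* = \left(\tfrac{46}{55}, \tfrac{78}{55}\right), \qquad \mathbf{x}_3^{(1)} = \mathbf{x}_3^* = \left(\tfrac{31}{22}, \tfrac{27}{22}\right)$$

[Note: Either part of this exercise shows that successive orthogonal projections of any point on the limit cycle will move around the limit cycle indefinitely.]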


4. The following three lines in the $x_1x_2$-plane,
$L_1$: $x_2 = 1$
$L_2$: $x_1 - x_2 = 2$
$L_3$: $x_1 - x_2 = 0$
do not have a common intersection. Draw an accurate sketch of the three lines and graphically perform several cycles of the orthogonal projections described in Algorithm 1, beginning with the initial point $\mathbf{x}_0 = (0, 0)$. On the basis of your sketch, determine the three points of the limit cycle.


Answer: 


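$\mathbf{x}_1^* = (1, 1)$, $\mathbf{x}_2^* = (2, 0)$, $\mathbf{x}_3^* = (1, 1)$

5. Prove Theorem 10.12.1 by verifying that
(a) the point $\mathbf{x}_p$ as defined in the theorem lies on the line $\mathbf{a}^T\mathbf{x} = b$ (i.e., $\mathbf{a}^T\mathbf{x}_p = b$).
(b) the vector $\mathbf{x}_p - \mathbf{x}^*$ is orthogonal to the line $\mathbf{a}^T\mathbf{x} = b$ (i.e., $\mathbf{x}_p - \mathbf{x}^*$ is parallel to $\mathbf{a}$).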


6. As stated in the text, the iterates $\mathbf{x}_M^{(1)}, \mathbf{x}_M^{(2)}, \mathbf{x}_M^{(3)}, \ldots$ defined in Algorithm 2 will converge to a unique limit point $\mathbf{x}_M^*$ if the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_M$ span $R^N$. Show that if this is the case and if the center-of-pixel method is used, then the center of each of the N pixels in the field of view is crossed by at least one of the M beams in the scan.


7. Construct the 12 beam equations in Example 2 using the center line method. Assume that the distance between the center lines of adjacent 
beams is equal to the width of a single pixel. 


Answer: 


$$\begin{aligned}
x_7 + x_8 + x_9 &= 13.00 \\
x_4 + x_5 + x_6 &= 15.00 \\
x_1 + x_2 + x_3 &= 8.00 \\
.82843(x_6 + x_8) + .58579x_9 &= 14.79 \\
1.41421(x_3 + x_5 + x_7) &= 14.31 \\
.82843(x_2 + x_4) + .58579x_1 &= 3.81 \\
x_3 + x_6 + x_9 &= 18.00 \\
x_2 + x_5 + x_8 &= 12.00 \\
x_1 + x_4 + x_7 &= 6.00 \\
.82843(x_2 + x_6) + .58579x_3 &= 10.51 \\
1.41421(x_1 + x_5 + x_9) &= 16.13 \\
.82843(x_4 + x_8) + .58579x_7 &= 7.04
\end{aligned}$$


8. Construct the 12 beam equations in Example 2 using the area method. Assume that the width of each beam is equal to the width of a single 
pixel and that the distance between the center lines of adjacent beams is also equal to the width of a single pixel. 


Answer: 


$$\begin{aligned}
x_7 + x_8 + x_9 &= 13.00 \\
x_4 + x_5 + x_6 &= 15.00 \\
x_1 + x_2 + x_3 &= 8.00 \\
.04289(x_3 + x_5 + x_7) + .75000(x_6 + x_8) + .61396x_9 &= 14.79 \\
.91421(x_3 + x_5 + x_7) + .25000(x_2 + x_4 + x_6 + x_8) &= 14.31 \\
.04289(x_3 + x_5 + x_7) + .75000(x_2 + x_4) + .61396x_1 &= 3.81 \\
x_3 + x_6 + x_9 &= 18.00 \\
x_2 + x_5 + x_8 &= 12.00 \\
x_1 + x_4 + x_7 &= 6.00 \\
.04289(x_1 + x_5 + x_9) + .75000(x_2 + x_6) + .61396x_3 &= 10.51 \\
.91421(x_1 + x_5 + x_9) + .25000(x_2 + x_4 + x_6 + x_8) &= 16.13 \\
.04289(x_1 + x_5 + x_9) + .75000(x_4 + x_8) + .61396x_7 &= 7.04
\end{aligned}$$


Section 10.12 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, Derive, or 
Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some linear algebra capabilities. For each 
exercise you will need to read the relevant documentation for the particular utility you are using. The goal of these exercises is to provide you 
with a basic proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be able to use your 
technology utility to solve many of the problems in the regular exercise sets. 


T1. Given the set of equations

$$a_k x + b_k y = c_k$$

for $k = 1, 2, 3, \ldots, n$ (with $n > 2$), let us consider the following algorithm for obtaining an approximate solution to the system.

1. Solve all possible pairs of equations

$$a_i x + b_i y = c_i \quad\text{and}\quad a_j x + b_j y = c_j$$

for $i, j = 1, 2, 3, \ldots, n$ and $i < j$ for their unique solutions. This leads to

$$\tfrac{1}{2}n(n - 1)$$

solutions, which we label as

$$(x_{ij}, y_{ij})$$

for $i, j = 1, 2, 3, \ldots, n$ and $i < j$.

2. Construct the geometric center of these points defined by

$$(x_c, y_c) = \left(\frac{2}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} x_{ij},\ \frac{2}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} y_{ij}\right)$$

and use this as the approximate solution to the original system.

Use this algorithm to approximate the solution to the system

$$\begin{aligned}
x + y &= 2 \\
x - 2y &= -2 \\
3x - y &= 3
\end{aligned}$$

and compare your results to those in this section. 
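
As one possible starting point for Exercise T1 in a general-purpose language rather than a computer algebra system, the two steps can be sketched in Python as follows. The function name `geometric_center` and the tuple layout `(a, b, c)` are hypothetical; the sketch assumes no two of the lines are parallel, so every pair has a unique intersection.

```python
from itertools import combinations


def geometric_center(lines):
    """lines: list of (a, b, c) for a*x + b*y = c.

    Returns the mean of all pairwise intersection points.
    """
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1              # nonzero when the two lines are not parallel
        x = (c1 * b2 - c2 * b1) / det        # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        points.append((x, y))
    n = len(points)                          # n(n-1)/2 intersection points
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


center = geometric_center([(1, 1, 2), (1, -2, -2), (3, -1, 3)])
print(center)
```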
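T2. (Calculus required) Given the set of equations

$$a_k x + b_k y = c_k$$

for $k = 1, 2, 3, \ldots, n$ (with $n > 2$), let us consider the following least squares algorithm for obtaining an approximate solution $(x^*, y^*)$ to the system. Given a point $(\alpha, \beta)$ and the line $a_i x + b_i y = c_i$, the distance from this point to the line is given by

$$\frac{|a_i\alpha + b_i\beta - c_i|}{\sqrt{a_i^2 + b_i^2}}$$

If we define a function $f(x, y)$ by

$$f(x, y) = \sum_{i=1}^{n} \frac{(a_i x + b_i y - c_i)^2}{a_i^2 + b_i^2}$$

and then determine the point $(x^*, y^*)$ that minimizes this function, we will determine the point that is closest to each of these lines in a summed least squares sense. Show that $x^*$ and $y^*$ are solutions to the system

$$\left(\sum_{i=1}^{n}\frac{a_i^2}{a_i^2 + b_i^2}\right)x^* + \left(\sum_{i=1}^{n}\frac{a_i b_i}{a_i^2 + b_i^2}\right)y^* = \sum_{i=1}^{n}\frac{a_i c_i}{a_i^2 + b_i^2}$$

and

$$\left(\sum_{i=1}^{n}\frac{a_i b_i}{a_i^2 + b_i^2}\right)x^* + \left(\sum_{i=1}^{n}\frac{b_i^2}{a_i^2 + b_i^2}\right)y^* = \sum_{i=1}^{n}\frac{b_i c_i}{a_i^2 + b_i^2}$$

Apply this algorithm to the system

$$\begin{aligned}
x + y &= 2 \\
x - 2y &= -2 \\
3x - y &= 3
\end{aligned}$$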

and compare your results to those in this section. 
10.13 Fractals


In this section we will use certain classes of linear transformations to describe and generate intricate sets in the Euclidean plane. These sets, called fractals, are 
currently the focus of much mathematical and scientific research. 


Prerequisites 


Geometry of Linear Operators on $R^2$ (Section 4.11)
Euclidean Space $R^n$

Natural Logarithms 

Intuitive Understanding of Limits 


Fractals in the Euclidean Plane 


At the end of the nineteenth century and the beginning of the twentieth century, various bizarre and wild sets of points in the Euclidean plane began appearing in mathematics. Although they were initially mathematical curiosities, these sets, called fractals, are rapidly growing in importance. It is now recognized that they reveal a regularity in physical and biological phenomena previously dismissed as “random,” “noisy,” or “chaotic.” For example, fractals are all around us in the shapes of clouds, mountains, coastlines, trees, and ferns.


In this section we give a brief description of certain types of fractals in the Euclidean plane $R^2$. Much of this description is an outgrowth of the work of two mathematicians, Benoit B. Mandelbrot and Michael Barnsley, who are both active researchers in the field.


Self-Similar Sets 


To begin our study of fractals, we need to introduce some terminology about sets in $R^2$. We will call a set in $R^2$ bounded if it can be enclosed by a suitably large circle (Figure 10.13.1) and closed if it contains all of its boundary points (Figure 10.13.2). Two sets in $R^2$ will be called congruent if they can be made to coincide exactly by translating and rotating them appropriately within $R^2$ (Figure 10.13.3). We will also rely on your intuitive concept of overlapping and nonoverlapping sets, as illustrated in Figure 10.13.4.


Figure 10.13.1 (a) A set enclosed by a circle. (b) An unbounded set; this set cannot be enclosed by any circle.


Figure 10.13.2 The boundary points (solid color) lie in the set.


Figure 10.13.3 Congruent sets.


Figure 10.13.4 (a) Overlapping sets. (b) Nonoverlapping sets.


If $T: R^2 \to R^2$ is the linear operator that scales by a factor of $s$ (see Table 7 of Section 4.9), and if $Q$ is a set in $R^2$, then the set $T(Q)$ (the set of images of points in $Q$ under $T$) is called a dilation of the set $Q$ if $s > 1$ and a contraction of $Q$ if $0 < s < 1$ (Figure 10.13.5). In either case we say that $T(Q)$ is the set $Q$ scaled by the factor $s$.


Figure 10.13.5 A contraction of $Q$.


The types of fractals we will consider first are called self-similar. In general, we define a self-similar set in $R^2$ as follows:


DEFINITION 1


A closed and bounded subset of the Euclidean plane $R^2$ is said to be self-similar if it can be expressed in the form

$$S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k \tag{1}$$

where $S_1, S_2, S_3, \ldots, S_k$ are nonoverlapping sets, each of which is congruent to $S$ scaled by the same factor $s$ ($0 < s < 1$).


If $S$ is a self-similar set, then 1 is sometimes called a decomposition of $S$ into nonoverlapping congruent sets.
EXAMPLE 1 Line Segment


A line segment in $R^2$ (Figure 10.13.6a) can be expressed as the union of two nonoverlapping congruent line segments (Figure 10.13.6b). In Figure 10.13.6b we have separated the two line segments slightly so that they can be seen more easily. Each of these two smaller line segments is congruent to the original line segment scaled by a factor of $\tfrac{1}{2}$. Hence, a line segment is a self-similar set with $k = 2$ and $s = \tfrac{1}{2}$.


Figure 10.13.6


EXAMPLE 2 Square


A square (Figure 10.13.7a) can be expressed as the union of four nonoverlapping congruent squares (Figure 10.13.7b), where we have again separated the smaller squares slightly. Each of the four smaller squares is congruent to the original square scaled by a factor of $\tfrac{1}{2}$. Hence, a square is a self-similar set with $k = 4$ and $s = \tfrac{1}{2}$.


Figure 10.13.7


EXAMPLE 3 Sierpinski Carpet


The set suggested by Figure 10.13.8a, the Sierpinski “carpet,” was first described by the Polish mathematician Waclaw Sierpinski (1882–1969). It can be expressed as the union of eight nonoverlapping congruent subsets (Figure 10.13.8b), each of which is congruent to the original set scaled by a factor of $\tfrac{1}{3}$. Hence, it is a self-similar set with $k = 8$ and $s = \tfrac{1}{3}$. Note that the intricate square-within-a-square pattern continues forever on a smaller and smaller scale (although this can only be suggested in a figure such as the one shown).


Figure 10.13.8


EXAMPLE 4 Sierpinski Triangle


Figure 10.13.9a illustrates another set described by Sierpinski. It is a self-similar set with $k = 3$ and $s = \tfrac{1}{2}$ (Figure 10.13.9b). As with the Sierpinski carpet, the intricate triangle-within-a-triangle pattern continues forever on a smaller and smaller scale.


Figure 10.13.9


The Sierpinski carpet and triangle have a more intricate structure than the line segment and the square in that they exhibit a pattern that is repeated indefinitely. This 
difference will be explored later in this section. 


Topological Dimension of a Set 


In Section 4.5 we defined the dimension of a subspace of a vector space to be the number of vectors in a basis, and we found that definition to coincide with our intuitive sense of dimension. For example, the origin of $R^2$ is zero-dimensional, lines through the origin are one-dimensional, and $R^2$ itself is two-dimensional. This definition of dimension is a special case of a more general concept called topological dimension, which is applicable to sets in $R^n$ that are not necessarily subspaces. A precise definition of this concept is studied in a branch of mathematics called topology. Although that definition is beyond the scope of this text, we can state informally that


* a point in $R^2$ has topological dimension zero;

* a curve in $R^2$ has topological dimension one;

* a region in $R^2$ has topological dimension two.

It can be proved that the topological dimension of a set in $R^n$ must be an integer between 0 and $n$, inclusive. In this text we will denote the topological dimension of a set $S$ by $d_T(S)$.


EXAMPLE 5 Topological Dimensions of Sets


Table 1 gives the topological dimensions of the sets studied in our earlier examples. The first two results in this table are intuitively obvious; however, the last two are not. Informally stated, the Sierpinski carpet and triangle both contain so many “holes” that those sets resemble web-like networks of lines rather than regions. Hence they have topological dimension one. The proofs are quite difficult.


Table 1

Set S                  d_T(S)
Line segment           1
Square                 2
Sierpinski carpet      1
Sierpinski triangle    1


Hausdorff Dimension of a Self-Similar Set 


In 1919 the German mathematician Felix Hausdorff (1868–1942) gave an alternative definition for the dimension of an arbitrary set in $R^n$. His definition is quite complicated, but for a self-similar set, it reduces to something rather simple:


DEFINITION 2


The Hausdorff dimension of a self-similar set $S$ of form 1 is denoted by $d_H(S)$ and is defined by

$$d_H(S) = \frac{\ln k}{\ln(1/s)} \tag{2}$$


In this definition, “ln” denotes the natural logarithm function. Equation 2 can also be expressed as

$$s^{d_H(S)} = \frac{1}{k} \tag{3}$$

in which the Hausdorff dimension $d_H(S)$ appears as an exponent. Formula 3 is more helpful for interpreting the concept of Hausdorff dimension; it states, for example, that if you scale a self-similar set by a factor of $s = \tfrac{1}{2}$, then its area (or more properly its measure) decreases by a factor of $\left(\tfrac{1}{2}\right)^{d_H(S)}$. Thus, scaling a line segment by a factor of $\tfrac{1}{2}$ reduces its measure (length) by a factor of $\left(\tfrac{1}{2}\right)^1 = \tfrac{1}{2}$, and scaling a square region by a factor of $\tfrac{1}{2}$ reduces its measure (area) by a factor of $\left(\tfrac{1}{2}\right)^2 = \tfrac{1}{4}$.
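
The equivalence of Formulas 2 and 3 takes only a few lines. As a short worked derivation, writing $d$ for $d_H(S)$ and using $0 < s < 1$ so that $\ln(1/s) \neq 0$:

```latex
\begin{align*}
d = \frac{\ln k}{\ln(1/s)}
  &\iff d\,\ln\!\left(\tfrac{1}{s}\right) = \ln k \\
  &\iff \left(\tfrac{1}{s}\right)^{d} = k \\
  &\iff s^{d} = \tfrac{1}{k}.
\end{align*}
```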


Before proceeding to some examples, we should note a few facts about the Hausdorff dimension of a set:
* The topological dimension and Hausdorff dimension of a set need not be the same.
* The Hausdorff dimension of a set need not be an integer.
* The topological dimension of a set is less than or equal to its Hausdorff dimension; that is, $d_T(S) \le d_H(S)$.


EXAMPLE 6 Hausdorff Dimensions of Sets


Table 2 lists the Hausdorff dimensions of the sets studied in our earlier examples.


Table 2

Set S                  s      k    d_H(S) = ln k / ln(1/s)
Line segment           1/2    2    ln 2/ln 2 = 1
Square                 1/2    4    ln 4/ln 2 = 2
Sierpinski carpet      1/3    8    ln 8/ln 3 = 1.892...
Sierpinski triangle    1/2    3    ln 3/ln 2 = 1.584...
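
The entries of Table 2 are easy to check numerically. The following Python sketch simply evaluates Formula 2 for the four sets, using the values of $k$ and $s$ from Examples 1 through 4; the function and dictionary names are hypothetical.

```python
import math


def hausdorff_dimension(k, s):
    """Formula 2: d_H(S) = ln k / ln(1/s) for k pieces, each scaled by s."""
    return math.log(k) / math.log(1 / s)


examples = {
    "line segment": (2, 1 / 2),
    "square": (4, 1 / 2),
    "Sierpinski carpet": (8, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
}
dims = {name: hausdorff_dimension(k, s) for name, (k, s) in examples.items()}
for name, d in dims.items():
    print(f"{name}: {d:.4f}")
```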


Fractals 
Comparing Tables 1 and 2, we see that the Hausdorff and topological dimensions are equal for both the line segment and square but are unequal for the Sierpinski carpet and triangle. In 1977 Benoit B. Mandelbrot suggested that sets for which the topological and Hausdorff dimensions differ must be quite complicated (as Hausdorff had earlier suggested in 1919). Mandelbrot proposed calling such sets fractals, and he offered the following definition.


DEFINITION 3


A fractal is a subset of a Euclidean space whose Hausdorff dimension and topological dimension are not equal.


According to this definition, the Sierpinski carpet and Sierpinski triangle are fractals, whereas the line segment and square are not.


It follows from the preceding definition that a set whose Hausdorff dimension is not an integer must be a fractal (why?). However, we will see later that the converse 
is not true; that is, it is possible for a fractal to have an integer Hausdorff dimension. 


Similitudes 


We will now show how some techniques from linear algebra can be used to generate fractals. This linear algebra approach also leads to algorithms that can be 
exploited to draw fractals on a computer. We begin with a definition. 


DEFINITION 4 


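A similitude with scale factor $s$ is a mapping of $R^2$ into $R^2$ of the form

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = s\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$$

where $s$, $\theta$, $e$, and $f$ are scalars.


Geometrically, a similitude is a composition of three simpler mappings: a scaling by a factor of $s$, a rotation about the origin through an angle $\theta$, and a translation ($e$ units in the x-direction and $f$ units in the y-direction). Figure 10.13.10 illustrates the effect of a similitude on the unit square $U$.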
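
To make the three-stage description concrete, here is a minimal Python sketch of a similitude acting on the corners of the unit square. The factory function `similitude` is hypothetical, the angle is taken in radians, and the parameter values in the usage lines are chosen only for illustration.

```python
import math


def similitude(s, theta, e, f):
    """Return the map (x, y) -> s * R(theta) * (x, y) + (e, f), theta in radians."""
    c, sn = math.cos(theta), math.sin(theta)

    def T(point):
        x, y = point
        # scale and rotate, then translate
        return (s * (c * x - sn * y) + e, s * (sn * x + c * y) + f)

    return T


# Illustrative parameters: halve, rotate a quarter turn, shift one unit right.
T = similitude(0.5, math.pi / 2, 1.0, 0.0)
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [T(p) for p in unit_square]
print(image)
```

Up to floating-point rounding, the four corners map to (1, 0), (1, 0.5), (0.5, 0.5), and (0.5, 0), which is again a square, now with side length 1/2.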


Figure 10.13.10 (a) Unit square. (b) Unit square after similitude (scaling, rotation, translation).


For our application to fractals, we will need only similitudes that are contractions, by which we mean that the scale factor $s$ is restricted to the range $0 < s < 1$. Consequently, when we refer to similitudes we will always mean similitudes subject to this restriction.


Similitudes are important in the study of fractals because of the following fact:


If $T: R^2 \to R^2$ is a similitude with scale factor $s$ and if $S$ is a closed and bounded set in $R^2$, then the image $T(S)$ of the set $S$ under $T$ is congruent to $S$ scaled by $s$.


Recall from the definition of a self-similar set in $R^2$ that a closed and bounded set $S$ in $R^2$ is self-similar if it can be expressed in the form

$$S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k$$

where $S_1, S_2, S_3, \ldots, S_k$ are nonoverlapping sets each of which is congruent to $S$ scaled by the same factor $s$ ($0 < s < 1$) [see 1]. In the following examples, we will find similitudes that produce the sets $S_1, S_2, S_3, \ldots, S_k$ from $S$ for the line segment, square, Sierpinski carpet, and Sierpinski triangle.


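EXAMPLE 7 Line Segment


We will take as our line segment the line segment $S$ connecting the points (0, 0) and (1, 0) in the xy-plane (Figure 10.13.11a). Consider the two similitudes

$$T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \qquad T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \tfrac{1}{2} \\ 0 \end{bmatrix} \tag{4}$$

both of which have $s = \tfrac{1}{2}$ and $\theta = 0$. In Figure 10.13.11b we show how these two similitudes map the unit square $U$. The similitude $T_1$ maps $U$ onto the smaller square $T_1(U)$, and the similitude $T_2$ maps $U$ onto the smaller square $T_2(U)$. At the same time, $T_1$ maps the line segment $S$ onto the smaller line segment $T_1(S)$, and $T_2$ maps $S$ onto the smaller nonoverlapping line segment $T_2(S)$. The union of these two smaller nonoverlapping line segments is precisely the original line segment $S$; that is,

$$S = T_1(S) \cup T_2(S) \tag{5}$$


Figure 10.13.11


EXAMPLE 8 Square


Let us consider the unit square $U$ in the xy-plane (Figure 10.13.12a) and the following four similitudes, all having $s = \tfrac{1}{2}$ and $\theta = 0$:

$$\begin{aligned}
T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, &
T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \tfrac{1}{2} \\ 0 \end{bmatrix} \\
T_3\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ \tfrac{1}{2} \end{bmatrix}, &
T_4\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \end{bmatrix}
\end{aligned} \tag{6}$$

The images of the unit square $U$ under these four similitudes are the four squares shown in Figure 10.13.12b. Thus,

$$U = T_1(U) \cup T_2(U) \cup T_3(U) \cup T_4(U) \tag{7}$$

is a decomposition of $U$ into four nonoverlapping squares that are congruent to $U$ scaled by the same scale factor ($s = \tfrac{1}{2}$).


Figure 10.13.12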


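EXAMPLE 9 Sierpinski Carpet


Let us consider a Sierpinski carpet $S$ over the unit square $U$ of the xy-plane (Figure 10.13.13a) and the following eight similitudes, all having $s = \tfrac{1}{3}$ and $\theta = 0$:

$$T_i\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{3}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}, \qquad i = 1, 2, 3, \ldots, 8 \tag{8}$$

where the eight values of $\begin{bmatrix} e_i \\ f_i \end{bmatrix}$ are

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} \tfrac{1}{3} \\ 0 \end{bmatrix},\ \begin{bmatrix} \tfrac{2}{3} \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ \tfrac{1}{3} \end{bmatrix},\ \begin{bmatrix} \tfrac{2}{3} \\ \tfrac{1}{3} \end{bmatrix},\ \begin{bmatrix} 0 \\ \tfrac{2}{3} \end{bmatrix},\ \begin{bmatrix} \tfrac{1}{3} \\ \tfrac{2}{3} \end{bmatrix},\ \begin{bmatrix} \tfrac{2}{3} \\ \tfrac{2}{3} \end{bmatrix}$$

The images of $S$ under these eight similitudes are the eight sets shown in Figure 10.13.13b. Thus,

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_8(S) \tag{9}$$

is a decomposition of $S$ into eight nonoverlapping sets that are congruent to $S$ scaled by the same scale factor ($s = \tfrac{1}{3}$).


Figure 10.13.13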


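EXAMPLE 10 Sierpinski Triangle


Let us consider a Sierpinski triangle $S$ fitted inside the unit square $U$ of the xy-plane, as shown in Figure 10.13.14a, and the following three similitudes, all having $s = \tfrac{1}{2}$ and $\theta = 0$:

$$\begin{aligned}
T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \\
T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \tfrac{1}{2} \\ 0 \end{bmatrix} \\
T_3\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) &= \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ \tfrac{1}{2} \end{bmatrix}
\end{aligned} \tag{10}$$

The images of $S$ under these three similitudes are the three sets in Figure 10.13.14b. Thus,

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \tag{11}$$

is a decomposition of $S$ into three nonoverlapping sets that are congruent to $S$ scaled by the same scale factor ($s = \tfrac{1}{2}$).


Figure 10.13.14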


In the preceding examples we started with a specific set $S$ and showed that it was self-similar by finding similitudes $T_1, T_2, T_3, \ldots, T_k$ with the same scale factor such that $T_1(S), T_2(S), T_3(S), \ldots, T_k(S)$ were nonoverlapping sets and such that

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S) \tag{12}$$

The following theorem addresses the converse problem of determining a self-similar set from a collection of similitudes.


THEOREM 10.13.1


If $T_1, T_2, T_3, \ldots, T_k$ are contracting similitudes with the same scale factor, then there is a unique nonempty closed and bounded set $S$ in the Euclidean plane such that

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S)$$

Furthermore, if the sets $T_1(S), T_2(S), T_3(S), \ldots, T_k(S)$ are nonoverlapping, then $S$ is self-similar.


Algorithms for Generating Fractals 


In general, there is no simple way to obtain the set S in the preceding theorem directly. We now describe an iterative procedure that will determine S from the 
similitudes that define it. We first give an example of the procedure and then give an algorithm for the general case. 


EXAMPLE 11 Sierpinski Carpet


Figure 10.13.15 shows the unit square region $S_0$ in the xy-plane, which will serve as an “initial” set for an iterative procedure for the construction of the Sierpinski carpet. The set $S_1$ in the figure is the result of mapping $S_0$ with each of the eight similitudes $T_i$ ($i = 1, 2, \ldots, 8$) in 8 that determine the Sierpinski carpet. It consists of eight square regions, each of side length $\tfrac{1}{3}$, surrounding an empty middle square. Next we apply the eight similitudes to $S_1$ and arrive at the set $S_2$. Similarly, applying the eight similitudes to $S_2$ results in the set $S_3$. If we continue this process indefinitely, the sequence of sets $S_1, S_2, S_3, \ldots$ will “converge” to a set $S$, which is the Sierpinski carpet.


Figure 10.13.15


Remark Although we should properly give a definition of what it means for a sequence of sets to “converge” to a given set, an intuitive interpretation will suffice in this introductory treatment.


Although we started in Figure 10.13.15 with the unit square region to arrive at the Sierpinski carpet, we could have started with any nonempty set $S_0$. The only restriction is that the set $S_0$ be closed and bounded. For example, if we start with the particular set $S_0$ shown in Figure 10.13.16, then $S_1$ is the set obtained by applying each of the eight similitudes in 8. Applying the eight similitudes to $S_1$ results in the set $S_2$. As before, applying the eight similitudes indefinitely yields the Sierpinski carpet $S$ as the limiting set.
The general algorithm illustrated in the preceding example is as follows: Let $T_1, T_2, T_3, \ldots, T_k$ be contracting similitudes with the same scale factor, and for an arbitrary set $Q$ in $R^2$, define the set $\mathcal{T}(Q)$ by

$$\mathcal{T}(Q) = T_1(Q) \cup T_2(Q) \cup T_3(Q) \cup \cdots \cup T_k(Q)$$

The following algorithm generates a sequence of sets $S_0, S_1, \ldots, S_n, \ldots$ that converges to the set $S$ in Theorem 10.13.1.
Algorithm 1
Step 0 Choose an arbitrary nonempty closed and bounded set $S_0$ in $R^2$.
Step 1 Compute $S_1 = \mathcal{T}(S_0)$.
Step 2 Compute $S_2 = \mathcal{T}(S_1)$.
Step 3 Compute $S_3 = \mathcal{T}(S_2)$.
⋮
Step n Compute $S_n = \mathcal{T}(S_{n-1})$.
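
Algorithm 1 can be tried out on a finite point set. The Python sketch below is one possible illustration, assuming the eight Sierpinski carpet similitudes ($s = 1/3$, $\theta = 0$, translations to every cell of the 3 × 3 grid except the middle one) and taking a single point as the initial closed and bounded set; exact fractions are used so that no points are merged or split by rounding. The names `shifts` and `set_map` are hypothetical.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Translations (e_i, f_i) of the eight carpet similitudes:
# every cell of the 3 x 3 grid except the middle one.
shifts = [(i * third, j * third) for j in range(3) for i in range(3) if (i, j) != (1, 1)]


def set_map(points):
    """One step S_n = T_1(S_{n-1}) U ... U T_8(S_{n-1}) on a finite point set."""
    return {(third * x + e, third * y + f) for (x, y) in points for (e, f) in shifts}


S = {(Fraction(0), Fraction(0))}   # S_0: a single point (nonempty, closed, bounded)
for _ in range(3):
    S = set_map(S)
print(len(S))   # 8**3 = 512 points make up S_3
```

Each pass multiplies the number of points by eight, and none of the points ever falls strictly inside the removed middle square, which is the finite-set counterpart of the pictures in Figure 10.13.15.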


EXAMPLE 12 Sierpinski Triangle


Let us construct the Sierpinski triangle determined by the three similitudes given in 10. The corresponding set mapping is $\mathcal{T}(Q) = T_1(Q) \cup T_2(Q) \cup T_3(Q)$. Figure 10.13.17 shows an arbitrary closed and bounded set $S_0$; the first four iterates $S_1, S_2, S_3, S_4$; and the limiting set $S$ (the Sierpinski triangle).


Figure 10.13.17


EXAMPLE 13 Using Algorithm 1


Consider the following two similitudes:

$$T_1\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \qquad T_2\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} .3 \\ .3 \end{bmatrix}$$

The actions of these two similitudes on the unit square $U$ are illustrated in Figure 10.13.18. Here, the rotation angle $\theta$ is a parameter that we will vary to generate different self-similar sets. The self-similar sets determined by these two similitudes are shown in Figure 10.13.19 for various values of $\theta$. For simplicity, we have not drawn the xy-axes, but in each case the origin is the lower left point of the set. These sets were generated on a computer using Algorithm 1 for the various values of $\theta$. Because $k = 2$ and $s = \tfrac{1}{2}$, it follows from 2 that the Hausdorff dimension of these sets for any value of $\theta$ is 1. It can be shown that the topological dimension of these sets is 1 for $\theta = 0$ and 0 for all other values of $\theta$. It follows that the self-similar set for $\theta = 0$ is not a fractal [it is the straight line segment from (0, 0) to (.6, .6)], while the self-similar sets for all other values of $\theta$ are fractals. In particular, they are examples of fractals with integer Hausdorff dimension.
A Monte Carlo Approach

The set-mapping approach of constructing self-similar sets described in Algorithm 1 is rather time-consuming on a computer because the similitudes involved must be applied to each of the many computer screen pixels in the successive iterated sets. In 1985 Michael Barnsley described an alternative, more practical method of generating a self-similar set defined through its similitudes. It is a so-called Monte Carlo method that takes advantage of probability theory. Barnsley refers to it as the Random Iteration Algorithm.


Let $T_1, T_2, T_3, \ldots, T_k$ be contracting similitudes with the same scale factor. The following algorithm generates a sequence of points

$$\begin{bmatrix} x_0 \\ y_0 \end{bmatrix},\ \begin{bmatrix} x_1 \\ y_1 \end{bmatrix},\ \ldots,\ \begin{bmatrix} x_n \\ y_n \end{bmatrix},\ \ldots$$

that collectively converge to the set $S$ in Theorem 10.13.1.
Algorithm 2
Step 0 Choose an arbitrary point $\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}$ in $S$.
Step 1 Choose one of the k similitudes at random, say $T_{k_1}$, and compute $\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = T_{k_1}\left(\begin{bmatrix} x_0 \\ y_0 \end{bmatrix}\right)$.
Step 2 Choose one of the k similitudes at random, say $T_{k_2}$, and compute $\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = T_{k_2}\left(\begin{bmatrix} x_1 \\ y_1 \end{bmatrix}\right)$.
⋮
Step n Choose one of the k similitudes at random, say $T_{k_n}$, and compute $\begin{bmatrix} x_n \\ y_n \end{bmatrix} = T_{k_n}\left(\begin{bmatrix} x_{n-1} \\ y_{n-1} \end{bmatrix}\right)$.

On a computer screen the pixels corresponding to the points generated by this algorithm will fill out the pixel representation of the limiting set $S$.


Figure 10.13.20 shows four stages of the Random Iteration Algorithm that generate the Sierpinski carpet, starting with the initial point $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.


Remark Although Step 0 in the preceding algorithm requires the selection of an initial point in the set $S$, which may not be known in advance, this is not a serious problem. In practice, one can usually start with any point in $R^2$ and after a few iterations (say ten or so), the point generated will be sufficiently close to $S$ that the algorithm will work correctly from that point on.
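
A minimal Python sketch of the Random Iteration Algorithm for the Sierpinski carpet follows. It assumes the eight carpet similitudes with $s = 1/3$ and $\theta = 0$, starts at the origin, and discards the first few points in the spirit of the Remark above; the function name, the seed, and the burn-in length are illustrative choices, and no plotting is attempted.

```python
import random

# Translations of the eight carpet similitudes (middle cell of the 3 x 3 grid omitted).
SHIFTS = [(i / 3, j / 3) for j in range(3) for i in range(3) if (i, j) != (1, 1)]


def random_iteration(n_points, seed=0, burn_in=10):
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    x, y = 0.0, 0.0                     # the origin is a point of the carpet
    points = []
    for step in range(n_points + burn_in):
        e, f = rng.choice(SHIFTS)       # choose one of the k = 8 similitudes at random
        x, y = x / 3 + e, y / 3 + f     # apply it to the previous point
        if step >= burn_in:
            points.append((x, y))
    return points


pts = random_iteration(5000)
print(len(pts), pts[:3])
```

Plotting `pts` with any graphics package should show the carpet emerging as the number of points grows, in the manner of Figure 10.13.20.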
Figure 10.13.20


More General Fractals 


So far, we have discussed fractals that are self-similar sets according to the definition of a self-similar set in $R^2$. However, Theorem 10.13.1 remains true if the similitudes $T_1, T_2, \ldots, T_k$ are replaced by more general transformations, called contracting affine transformations. An affine transformation is defined as follows:


DEFINITION 5


An affine transformation is a mapping of $R^2$ into $R^2$ of the form

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}$$

where $a$, $b$, $c$, $d$, $e$, and $f$ are scalars.


Figure 10.13.21 shows how an affine transformation maps the unit square $U$ onto a parallelogram $T(U)$. An affine transformation is said to be contracting if the Euclidean distance between any two points in the plane is strictly decreased after the two points are mapped by the transformation. It can be shown that any $k$ contracting affine transformations $T_1, T_2, \ldots, T_k$ determine a unique closed and bounded set $S$ satisfying the equation

$$S = T_1(S) \cup T_2(S) \cup T_3(S) \cup \cdots \cup T_k(S) \tag{13}$$

Equation 13 has the same form as Equation 12, which we used to find self-similar sets. Although Equation 13, which uses contracting affine transformations, does not determine a self-similar set $S$, the set it does determine has many of the features of self-similar sets. For example, Figure 10.13.22 shows how a set in the plane resembling a fern (an example made famous by Barnsley) can be generated through four contracting affine transformations. Note that the middle fern is the slightly overlapping union of the four smaller affine-image ferns surrounding it. Note also how $T_3$, because the determinant of its matrix part is zero, maps the entire fern onto the small straight line segment between the points (.50, 0) and (.50, .16). Figure 10.13.22 contains a wealth of information and should be studied carefully.
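
Whether a given affine transformation is contracting can be checked numerically. One standard test, used in the sketch below, is that the largest singular value of the matrix part must be less than 1, since that value is the greatest factor by which the map can stretch the distance between two points. The function names are hypothetical, and the coefficient values in the usage lines are made up for illustration (the first merely mimics a map with zero determinant that flattens everything onto a short vertical segment); they are not taken from Figure 10.13.22.

```python
import math


def affine(a, b, c, d, e, f):
    """Return the map (x, y) -> (a*x + b*y + e, c*x + d*y + f)."""
    return lambda p: (a * p[0] + b * p[1] + e, c * p[0] + d * p[1] + f)


def is_contracting(a, b, c, d):
    """True when the largest singular value of [[a, b], [c, d]] is below 1."""
    t = a * a + b * b + c * c + d * d        # trace of M^T M
    det = a * d - b * c                      # det(M); det(M^T M) = det**2
    largest_sq = (t + math.sqrt(max(t * t - 4 * det * det, 0.0))) / 2
    return largest_sq < 1


flatten = affine(0.0, 0.0, 0.0, 0.16, 0.5, 0.0)   # hypothetical zero-determinant map
print(flatten((3.0, 1.0)))                        # lands on the line x = 0.5
print(is_contracting(0.0, 0.0, 0.0, 0.16))        # contracting
print(is_contracting(1.0, 1.0, 0.0, 1.0))         # a shear is not contracting
```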
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Figure 10.13.21 
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Figure 10.13.22 


Michael Barnsley has applied the above theory to the field of data compression and transmission. The fern, for example, is completely determined by the four affine
transformations T_1, T_2, T_3, T_4. These four transformations, in turn, are determined by the 24 numbers given in Figure 10.13.22 defining their corresponding values
of a, b, c, d, e, and f. In other words, these 24 numbers completely encode the picture of the fern. Storing these 24 numbers in a computer requires considerably less
memory space than storing a pixel-by-pixel description of the fern. In principle, any picture represented by a pixel map on a computer screen can be described
through a finite number of affine transformations, although it is not easy to determine which transformations to use. Nevertheless, once encoded, the affine
transformations generally require several orders of magnitude less computer memory than a pixel-by-pixel description of the pixel map.
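To make the idea of "24 numbers encode the fern" concrete, the following Python sketch generates fern points from four affine maps by random iteration (the so-called chaos game). It is only an illustration under stated assumptions: the coefficients are the widely published Barnsley fern values, which are not necessarily the ones in Figure 10.13.22, and the selection probabilities and the names `FERN`, `apply_affine`, and `fern_points` are introduced here for the example.

```python
import random

# Each row is (a, b, c, d, e, f, probability) for the affine map
# (x, y) -> (a*x + b*y + e, c*x + d*y + f).
# Assumption: these are the commonly published Barnsley fern values,
# not necessarily the numbers shown in Figure 10.13.22.
FERN = [
    (0.00, 0.00, 0.00, 0.16, 0.0, 0.00, 0.01),
    (0.85, 0.04, -0.04, 0.85, 0.0, 1.60, 0.85),
    (0.20, -0.26, 0.23, 0.22, 0.0, 1.60, 0.07),
    (-0.15, 0.28, 0.26, 0.24, 0.0, 0.44, 0.07),
]

def apply_affine(t, x, y):
    """Apply one affine transformation to the point (x, y)."""
    a, b, c, d, e, f, _ = t
    return a * x + b * y + e, c * x + d * y + f

def fern_points(count, seed=0):
    """Generate points by repeatedly applying a randomly chosen map."""
    rng = random.Random(seed)
    weights = [t[6] for t in FERN]
    x, y = 0.0, 0.0
    points = []
    for _ in range(count):
        t = rng.choices(FERN, weights=weights)[0]
        x, y = apply_affine(t, x, y)
        points.append((x, y))
    return points

points = fern_points(5000)
print(len(FERN) * 6, "coefficients generate", len(points), "points")
```

The six coefficients per map are the only geometric data stored; the probabilities affect how quickly the picture fills in, not which set is drawn.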


Further Readings 


Readers interested in learning more about fractals are referred to the following books, the first of which elaborates on the linear transformation approach of 
this section. 


1. Michael Barnsley, Fractals Everywhere (New York: Academic Press, 1993). 
2. Benoit B. Mandelbrot, The Fractal Geometry of Nature (New York: W. H. Freeman, 1982). 
3. Heinz-Otto Peitgen and P. H. Richter, The Beauty of Fractals (New York: Springer-Verlag, 1986). 


4. Heinz-Otto Peitgen and Dietmar Saupe, The Science of Fractal Images (New York: Springer-Verlag, 1988). 


Exercise Set 10.13 


1. The self-similar set in Figure Ex-1 has the sizes indicated. Given that its lower left corner is situated at the origin of the xy-plane, find the similitudes that 
determine the set. What is its Hausdorff dimension? Is it a fractal? 


Figure Ex-1 
2. Find the Hausdorff dimension of the self-similar set shown in Figure Ex-2. Use a ruler to measure the figure and determine an approximate value of the scale factor
s. What are the rotation angles of the similitudes determining this set? 


Figure Ex-2 


Answer: 


s ≈ .47; d_H(S) ≈ ln(4) / ln(1/.47) = 1.8.... Rotation angles: 0° (upper left); −90° (upper right); 180° (lower left); 180° (lower right)


3. Each of the 12 self-similar sets in Figure Ex-3 results from three similitudes with scale factor of 1/2, and so all have Hausdorff dimension ln 3 / ln 2 = 1.584.... The
rotation angles of the three similitudes are all multiples of 90°. Find these rotation angles for each set and express them as a triplet of integers (n_1, n_2, n_3), where
n_i is the corresponding integer multiple of 90° in the order upper right, lower left, lower right. For example, the first set (the Sierpinski triangle) generates the
triplet (0, 0, 0).


Figure Ex-3 


Answer: 


(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (0, 0, 1), (0, 0, 2), (1, 2, 0), (2, 1, 3), (2, 0, 1), (2, 0, 2), (2, 2, 0), (0, 3, 3)
4. For each of the self-similar sets in Figure Ex-4, find: 

(i) the scale factor s of the similitudes describing the set; 

(ii) the rotation angles θ of all similitudes describing the set (all rotation angles are multiples of 90°); and

(iii) the Hausdorff dimension of the set. 

Which of the sets are fractals and why? 


Figure Ex-4 


Answer: 


(a) (i) s = 1/3; (ii) all rotation angles are 0°; (iii) d_H(S) = ln(7) / ln(3) = 1.771.... This set is a fractal.
(b) (i) s = 1/2; (ii) all rotation angles are 180°; (iii) d_H(S) = ln(3) / ln(2) = 1.584.... This set is a fractal.
(c) (i) s = 1/2; (ii) rotation angles: −90° (top); 180° (lower left); 180° (lower right); (iii) d_H(S) = ln(3) / ln(2) = 1.584.... This set is a fractal.
(d) (i) s = 1/2; (ii) rotation angles: 90° (upper left); 180° (upper right); 180° (lower right); (iii) d_H(S) = ln(3) / ln(2) = 1.584.... This set is a fractal.
5. Show that of the four affine transformations shown in Figure 10.13.22, only the transformation T_2 is a similitude. Determine its scale factor s and rotation angle θ.


Answer: 


s = .8509..., θ = −2.69°...
6. Find the coordinates of the tip of the fern in Figure 10.13.22. [Hint: The transformation T_2 maps the tip of the fern to itself.]
Answer: 


(0.766, 0.996) rounded to three decimal places 


7. The square in Figure 10.13.7a was expressed as the union of 4 nonoverlapping squares as in Figure 10.13.7b. Suppose that it is expressed instead as the union of
16 nonoverlapping squares. Verify that its Hausdorff dimension is still 2, as determined by Equation 2. 


Answer: 


d_H(S) = ln(16) / ln(4) = 2
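8. Show that the four similitudes

\[
T_1\begin{bmatrix} x \\ y \end{bmatrix} = \frac{3}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \qquad
T_2\begin{bmatrix} x \\ y \end{bmatrix} = \frac{3}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 1/4 \\ 0 \end{bmatrix}
\]

\[
T_3\begin{bmatrix} x \\ y \end{bmatrix} = \frac{3}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 0 \\ 1/4 \end{bmatrix}, \qquad
T_4\begin{bmatrix} x \\ y \end{bmatrix} = \frac{3}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 1/4 \\ 1/4 \end{bmatrix}
\]

express the unit square as the union of four overlapping squares. Evaluate the right-hand side of Equation 2 for the values of k and s determined by these
similitudes, and show that the result is not the correct value of the Hausdorff dimension of the unit square. [Note: This exercise shows the necessity of the
nonoverlapping condition in the definition of a self-similar set and its Hausdorff dimension.]


Answer:

ln(4) / ln(4/3) = 4.818...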


9. All of the results in this section can be extended to R^n. Compute the Hausdorff dimension of the unit cube in R^3 (see Figure Ex-9). Given that the topological
dimension of the unit cube is 3, determine whether it is a fractal. [Hint: Express the unit cube as the union of eight smaller congruent nonoverlapping cubes.] 


Figure Ex-9 


Answer: 


d_H(S) = ln(8) / ln(2) = 3; the cube is not a fractal.


10. The set in R^3 in Figure Ex-10 is called the Menger sponge. It is a self-similar set obtained by drilling out certain square holes from the unit cube. Note that each
face of the Menger sponge is a Sierpinski carpet and that the holes in the Sierpinski carpet now run all the way through the Menger sponge. Determine the values
of k and s for the Menger sponge and find its Hausdorff dimension. Is the Menger sponge a fractal?


Figure Ex-10 


Answer: 


k = 20; s = 1/3; d_H(S) = ln(20) / ln(3) = 2.726...; the set is a fractal.


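11. The two similitudes

\[
T_1\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
\qquad \text{and} \qquad
T_2\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 2/3 \\ 0 \end{bmatrix}
\]

determine a fractal known as the Cantor set. Starting with the unit square region U as an initial set, sketch the first four sets that Algorithm 1 determines. Also,
find the Hausdorff dimension of the Cantor set. (This famous set was the first example that Hausdorff gave in his 1919 paper of a set whose Hausdorff dimension
is not equal to its topological dimension.)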


Answer: 


[Figure: sketches of the initial set and the first, second, third, and fourth iterates]


d_H(S) = ln(2) / ln(3) = 0.6309...


12. Compute the areas of the sets S_0, S_1, S_2, S_3, and S_4 in Figure 10.13.15.
Answer:
Area of S_0 = 1; area of S_1 = 8/9 = 0.888...; area of S_2 = (8/9)^2 = 0.790...; area of S_3 = (8/9)^3 = 0.702...; area of S_4 = (8/9)^4 = 0.624...


Section 10.13 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may 
also be some other type of linear algebra software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you 
have mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the regular exercise sets. 


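T1. Use similitudes of the form

\[
T_i\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} a_i \\ b_i \\ c_i \end{bmatrix}
\]

to show that the Menger sponge (see Exercise 10) is the set S satisfying

\[
S = \bigcup_{i=1}^{20} T_i(S)
\]

for appropriately chosen similitudes T_i (for i = 1, 2, 3, ..., 20). Determine these similitudes by determining the collection of 3 × 1 matrices

\[
\begin{bmatrix} a_i \\ b_i \\ c_i \end{bmatrix} \quad \text{for } i = 1, 2, 3, \ldots, 20
\]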


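T2. Generalize the ideas involved in the Cantor set (in R^1), the Sierpinski carpet (in R^2), and the Menger sponge (in R^3) to R^n by considering the set S satisfying

\[
S = \bigcup_{i=1}^{m_n} T_i(S)
\]

with

\[
T_i\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} a_{1i} \\ a_{2i} \\ a_{3i} \\ \vdots \\ a_{ni} \end{bmatrix}
\]

where each a_{ki} equals 0, 1/3, or 2/3, and no two of them ever equal 1/3 at the same time. Use a computer to construct the set

\[
\left\{ \begin{bmatrix} a_{1i} \\ a_{2i} \\ a_{3i} \\ \vdots \\ a_{ni} \end{bmatrix} \right\} \quad \text{for } i = 1, 2, 3, \ldots, m_n
\]

thereby determining the value of m_n for n = 2, 3, 4. Then develop an expression for m_n.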
10.14 Chaos


In this section we use a map of the unit square in the xy-plane onto itself to describe the concept of a chaotic mapping. 


Prerequisites 


Geometry of Linear Operators on R^2 (Section 4.11)
Eigenvalues and Eigenvectors 


Intuitive Understanding of Limits and Continuity 


Chaos 


The word chaos was first used in a mathematical sense in 1975 by Tien-Yien Li and James Yorke in a paper entitled “Period
Three Implies Chaos.” The term is now used to describe the behavior of certain mathematical mappings and physical phenomena
that at first glance seem to behave in a random or disorderly fashion but actually have an underlying element of order (examples 
include random-number generation, shuffling cards, cardiac arrhythmia, fluttering airplane wings, changes in the red spot of 
Jupiter, and deviations in the orbit of Pluto). In this section we discuss a particular chaotic mapping called Arnold's cat map, after 
the Russian mathematician Vladimir I. Arnold who first described it using a diagram of a cat. 


Arnold's Cat Map 


To describe Arnold's cat map, we need a few ideas about modular arithmetic. If x is a real number, then the notation x mod 1 
denotes the unique number in the interval [0, 1) that differs from x by an integer. For example, 

2.3 mod 1 = 0.3, 0.9 mod 1 = 0.9, −3.7 mod 1 = 0.3, 2.0 mod 1 = 0
Note that if x is a nonnegative number, then x mod 1 is simply the fractional part of x. If (x, y) is an ordered pair of real numbers,
then the notation (x, y) mod 1 denotes (x mod 1, y mod 1). For example,

(2.3, −7.9) mod 1 = (0.3, 0.1)

Observe that for every real number x, the point x mod 1 lies in the unit interval [0, 1) and that for every ordered pair (x, y), the
point (x, y) mod 1 lies in the unit square


S = {(x, y) | 0 ≤ x < 1, 0 ≤ y < 1}


Also observe that the upper boundary and the right-hand boundary of the square are not included in S.
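The mod 1 notation can be checked with a few lines of Python. This is a minimal sketch, assuming exact rational inputs so that the results are not blurred by floating-point rounding; the helper names `mod1` and `mod1_pair` are introduced here for illustration.

```python
import math
from fractions import Fraction

def mod1(x):
    """Return the number in [0, 1) that differs from x by an integer."""
    return x - math.floor(x)

def mod1_pair(point):
    """Apply mod 1 to each coordinate of an ordered pair."""
    x, y = point
    return (mod1(x), mod1(y))

print(mod1(Fraction("2.3")))    # 3/10
print(mod1(Fraction("-3.7")))   # 3/10
print(mod1_pair((Fraction("2.3"), Fraction("-7.9"))))
```

Subtracting the floor, rather than truncating toward zero, is what makes negative inputs such as −3.7 land in [0, 1).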


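Arnold's cat map is the transformation Γ: R^2 → R^2 defined by the formula

Γ: (x, y) → (x + y, x + 2y) mod 1

or, in matrix notation,

\[
\Gamma\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \tag{1}
\]

To understand the geometry of Arnold's cat map, it is helpful to write Formula 1 in the factored form

\[
\Gamma\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \bmod 1
\]

which expresses Arnold's cat map as the composition of a shear in the x-direction with factor 1, followed by a shear in the
y-direction with factor 1. Because the computations are performed mod 1, Γ maps all points of R^2 into the unit square S.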


We will illustrate the effect of Arnold's cat map on the unit square S, which is shaded in Figure 10.14.1a and contains a picture of 
a cat. It can be shown that it does not matter whether the mod 1 computations are carried out after each shear or at the very end. 
We will discuss both methods, first performing them at the end. The steps are as follows: 


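Step 1 Shear in the x-direction with factor 1 (Figure 10.14.1b):
(x, y) → (x + y, y)

or, in matrix notation,

\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + y \\ y \end{bmatrix}
\]

Step 2 Shear in the y-direction with factor 1 (Figure 10.14.1c):
(x, y) → (x, x + y)

or, in matrix notation,

\[
\begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ x + y \end{bmatrix}
\]

Step 3 Reassembly into S (Figure 10.14.1d):
(x, y) → (x, y) mod 1


The geometric effect of the mod 1 arithmetic is to break up the parallelogram in Figure 10.14.1c and reassemble the pieces in S as
shown in Figure 10.14.1d.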


Figure 10.14.1 [panels (a) to (d): the cat in S; Step 1, (x, y) → (x + y, y); Step 2, (x, y) → (x, x + y); Step 3, (x, y) → (x, y) mod 1]


For computer implementation, it is more convenient to perform the mod 1 arithmetic at each step, rather than at the end. With this 
approach there is a reassembly at each step, but the net effect is the same. The steps are as follows: 


Step 1 Shear in the x-direction with factor 1, followed by a reassembly into S (Figure 10.14.2b):
(x, y) → (x + y, y) mod 1

Step 2 Shear in the y-direction with factor 1, followed by a reassembly into S (Figure 10.14.2c):
(x, y) → (x, x + y) mod 1


Figure 10.14.2
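The claim that the mod 1 reduction may be done after each shear or only at the end can be spot-checked numerically. The sketch below compares the two procedures on a small grid of rational points; the grid size 7 and the function names are arbitrary choices made for this illustration, and a check on sample points is of course not a proof.

```python
import math
from fractions import Fraction

def mod1(x):
    """Return the number in [0, 1) that differs from x by an integer."""
    return x - math.floor(x)

def cat_map_mod_at_end(x, y):
    x1, y1 = x + y, y          # shear in the x-direction with factor 1
    x2, y2 = x1, x1 + y1       # shear in the y-direction with factor 1
    return mod1(x2), mod1(y2)  # single reassembly at the end

def cat_map_mod_each_step(x, y):
    x1, y1 = mod1(x + y), mod1(y)   # shear, then reassemble
    return mod1(x1), mod1(x1 + y1)  # shear, then reassemble

p = 7
grid = [(Fraction(m, p), Fraction(n, p)) for m in range(p) for n in range(p)]
same = all(cat_map_mod_at_end(x, y) == cat_map_mod_each_step(x, y) for x, y in grid)
print(same)
```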


Repeated Mappings 


Chaotic mappings such as Arnold's cat map usually arise in physical models in which an operation is performed repeatedly. For 
example, cards are mixed by repeated shuffles, paint is mixed by repeated stirs, water in a tidal basin is mixed by repeated tidal 
changes, and so forth. Thus, we are interested in examining the effect on S of repeated applications (or iterations) of Arnold's cat 
map. Figure 10.14.3, which was generated on a computer, shows the effect of 25 iterations of Arnold's cat map on the cat in the 
unit square S. Two interesting phenomena occur: 

• The cat returns to its original form at the 25th iteration.

• At some of the intermediate iterations, the cat is decomposed into streaks that seem to have a specific direction.


Much of the remainder of this section is devoted to explaining these phenomena. 
Figure 10.14.3


Periodic Points 


Our first goal is to explain why the cat in Figure 10.14.3 returns to its original configuration at the 25th iteration. For this purpose 
it will be helpful to think of a picture in the xy-plane as an assignment of colors to the points in the plane. For pictures generated 
on a computer screen or other digital device, hardware limitations require that a picture be broken up into discrete squares, called 
pixels. For example, in the computer-generated pictures in Figure 10.14.3 the unit square S is divided into a grid with 101 pixels 
on a side for a total of 10,201 pixels, each of which is black or white (Figure 10.14.4). An assignment of colors to pixels to create
a picture is called a pixel map.


Enlarged view of cat's face 


showing individual pixels 


Figure 10.14.4 


As shown in Figure 10.14.5, each pixel in S can be assigned a unique pair of coordinates of the form (m/101, n/101) that
identifies its lower left-hand corner, where m and n are integers in the range 0, 1, 2, ..., 100. We call these points pixel points
because each such point identifies a unique pixel. Instead of restricting the discussion to the case where S is subdivided into an
array with 101 pixels on a side, let us consider the more general case where there are p pixels per side. Thus, each pixel map in S
consists of p^2 pixels uniformly spaced 1/p units apart in both the x- and the y-directions. The pixel points in S have coordinates
of the form (m/p, n/p), where m and n are integers ranging from 0 to p − 1.
Figure 10.14.5


Under Arnold's cat map each pixel point of S is transformed into another pixel point of S. To see why this is so, observe that the
image of the pixel point (m/p, n/p) under Γ is given in matrix form by

\[
\Gamma\left(\begin{bmatrix} m/p \\ n/p \end{bmatrix}\right) = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} m/p \\ n/p \end{bmatrix} \bmod 1 = \begin{bmatrix} (m + n)/p \\ (m + 2n)/p \end{bmatrix} \bmod 1 \tag{2}
\]

The ordered pair ((m + n)/p, (m + 2n)/p) mod 1 is of the form (m'/p, n'/p), where m' and n' lie in the range
0, 1, 2, ..., p − 1. Specifically, m' and n' are the remainders when m + n and m + 2n are divided by p, respectively.
Consequently, each point in S of the form (m/p, n/p) is mapped onto another point of the same form.


Because Arnold's cat map transforms every pixel point of S into another pixel point of S, and because there are only p^2 different
pixel points in S, it follows that any given pixel point must return to its original position after at most p^2 iterations of Arnold's cat
map.
EXAMPLE 1 Using Formula 2


If p = 76, then Formula 2 becomes

\[
\Gamma\left(\begin{bmatrix} m/76 \\ n/76 \end{bmatrix}\right) = \begin{bmatrix} (m + n)/76 \\ (m + 2n)/76 \end{bmatrix} \bmod 1
\]

In this case the successive iterates of the point (27/76, 58/76) are

Iterate   0      1      2      3      4      5      6      7      8
x         27/76  9/76   0/76   67/76  49/76  4/76   39/76  37/76  72/76
y         58/76  67/76  67/76  58/76  31/76  35/76  74/76  35/76  31/76


(verify). Because the point returns to its initial position on the ninth application of Arnold's cat map (but no sooner), 
the point is said to have period 9, and the set of nine distinct iterates of the point is called a 9-cycle. Figure 10.14.6 
shows this 9-cycle with the initial point labeled 0 and its successive iterates labeled accordingly. 
Figure 10.14.6
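The 9-cycle in Example 1 can be reproduced with integer arithmetic by working with the numerators m and n and reducing mod p, which is what Formula 2 amounts to. The following sketch is one way to do this; the function names `cat_step` and `orbit` are introduced here for illustration.

```python
def cat_step(m, n, p):
    """One application of Arnold's cat map to the pixel point (m/p, n/p)."""
    return (m + n) % p, (m + 2 * n) % p

def orbit(m, n, p):
    """List the distinct iterates of (m/p, n/p), starting with the point itself."""
    points = [(m, n)]
    current = cat_step(m, n, p)
    while current != (m, n):
        points.append(current)
        current = cat_step(*current, p)
    return points

cycle = orbit(27, 58, 76)
print(len(cycle))   # period of the point (27/76, 58/76)
print(cycle)        # numerators of the successive iterates
```

The loop terminates because the map is invertible on the finite set of pixel points (the matrix has determinant 1), so every orbit must close up on its starting point.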


In general, a point that returns to its initial position after n applications of Arnold's cat map, but does not return with fewer than n
applications, is said to have period n, and its set of n distinct iterates is called an n-cycle. Arnold's cat map maps (0, 0) into
(0, 0), so this point has period 1. Points with period 1 are also called fixed points. We leave it as an exercise (Exercise 11) to
show that (0, 0) is the only fixed point of Arnold's cat map.


Period Versus Pixel Width 


If P_1 and P_2 are points with periods q_1 and q_2, respectively, then P_1 returns to its initial position in q_1 iterations (but no sooner),
and P_2 returns to its initial position in q_2 iterations (but no sooner); thus, both points return to their initial positions in any number
of iterations that is a multiple of both q_1 and q_2. In general, for a pixel map with p^2 pixel points of the form (m/p, n/p), we let
Π(p) denote the least common multiple of the periods of all the pixel points in the map [i.e., Π(p) is the smallest integer that is
divisible by all of the periods]. It follows that the pixel map will return to its initial configuration in Π(p) iterations of Arnold's
cat map (but no sooner). For this reason, we call Π(p) the period of the pixel map. In Exercise 4 we ask you to show that if
p = 101, then all pixel points have period 1, 5, or 25, so Π(101) = 25. This explains why the cat in Figure 10.14.3 returned to
its initial configuration in 25 iterations.
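For small p, the period Π(p) of a pixel map can be found by brute force directly from its definition as a least common multiple. The sketch below does exactly that; it is meant as an illustration rather than an efficient method, and the function names are introduced here.

```python
from functools import reduce
from math import gcd

def cat_step(m, n, p):
    """One application of Arnold's cat map to the pixel point (m/p, n/p)."""
    return (m + n) % p, (m + 2 * n) % p

def point_period(m, n, p):
    """Smallest number of iterations that returns (m/p, n/p) to itself."""
    steps, current = 1, cat_step(m, n, p)
    while current != (m, n):
        current = cat_step(*current, p)
        steps += 1
    return steps

def pixel_map_period(p):
    """Return Pi(p) and the sorted list of periods that occur among the p*p pixel points."""
    periods = {point_period(m, n, p) for m in range(p) for n in range(p)}
    least_common_multiple = reduce(lambda a, b: a * b // gcd(a, b), periods, 1)
    return least_common_multiple, sorted(periods)

print(pixel_map_period(101))
print(pixel_map_period(6))
```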


Figure 10.14.7 shows how the period of a pixel map varies with p. Although the general tendency is for the period to increase as p 
increases, there is a surprising amount of irregularity in the graph. Indeed, there is no simple function that specifies this 
relationship (see Exercise 1). 
Figure 10.14.7


Although a pixel map with p pixels on a side does not return to its initial configuration until Π(p) iterations have occurred,
various unexpected things can occur at intermediate iterations. For example, Figure 10.14.8 shows a pixel map with p = 250 of
the famous Hungarian-American mathematician John von Neumann. It can be shown that Π(250) = 750; hence, the pixel map
will return to its initial configuration after 750 iterations of Arnold's cat map (but no sooner). However, after 375 iterations the
pixel map is turned upside down, and after another 375 iterations (for a total of 750) the pixel map is returned to its initial
configuration. Moreover, there are so many pixel points with periods that divide 750 that multiple ghostlike images of the original
likeness occur at intermediate iterations; at 195 iterations numerous miniatures of the original likeness occur in diagonal rows.
Figure 10.14.8


The Tiled Plane 


Our next objective is to explain the cause of the linear streaks that occur in Figure 10.14.3. For this purpose it will be helpful to 
view Arnold's cat map another way. As defined, Arnold's cat map is not a linear transformation because of the mod 1 arithmetic. 
However, there is an alternative way of defining Arnold's cat map that avoids the mod 1 arithmetic and results in a linear 
transformation. For this purpose, imagine that the unit square S with its picture of the cat is a “tile,” and suppose that the entire
plane is covered with such tiles, as in Figure 10.14.9. We say that the xy-plane has been tiled with the unit square. If we apply the
matrix transformation in Formula 1 to the entire tiled plane without performing the mod 1 arithmetic, then it can be shown that the portion
of the image within S will be identical to the image that we obtained using the mod 1 arithmetic (Figure 10.14.9). In short, the 
tiling results in the same pixel map in S as the mod 1 arithmetic, but in the tiled case Arnold's cat map is a linear transformation. 


Figure 10.14.9


It is important to understand, however, that tiling and mod 1 arithmetic produce periodicity in different ways. If a pixel map in S 
has period n, then in the case of mod 1 arithmetic, each point returns to its original position at the end of n iterations. In the case 
of tiling, points need not return to their original positions; rather, each point is replaced by a point of the same color at the end of n 
iterations. 


Properties of Arnold's Cat Map 


To understand the cause of the streaks in Figure 10.14.3, think of Arnold's cat map as a linear transformation on the tiled plane. 
Observe that the matrix

\[
C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\]


that defines Arnold's cat map is symmetric and has a determinant of 1. The fact that the determinant is 1 means that multiplication 
by this matrix preserves areas; that is, the area of any figure in the plane and the area of its image are the same. This is also true 
for figures in S in the case of mod 1 arithmetic, since the effect of the mod 1 arithmetic is to cut up the figure and reassemble the 
pieces without any overlap, as shown in Figure 10.14.1d. Thus, in Figure 10.14.3 the area of the cat (whatever it is) is the same as 
the total area of the blotches in each iteration. 


The fact that the matrix is symmetric means that its eigenvalues are real and the corresponding eigenvectors are perpendicular. We
leave it for you to show that the eigenvalues and corresponding eigenvectors of C are

\[
\lambda_1 = \frac{3 + \sqrt{5}}{2} = 2.6180\ldots, \qquad \lambda_2 = \frac{3 - \sqrt{5}}{2} = 0.3819\ldots
\]

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ \frac{1 + \sqrt{5}}{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 1.6180\ldots \end{bmatrix}, \qquad
\mathbf{v}_2 = \begin{bmatrix} \frac{-1 - \sqrt{5}}{2} \\ 1 \end{bmatrix} = \begin{bmatrix} -1.6180\ldots \\ 1 \end{bmatrix}
\]

For each application of Arnold's cat map, the eigenvalue λ_1 causes a stretching in the direction of the eigenvector v_1 by a factor
of 2.6180..., and the eigenvalue λ_2 causes a compression in the direction of the eigenvector v_2 by a factor of 0.3819.... Figure
10.14.10 shows a square centered at the origin whose sides are parallel to the two eigenvector directions. Under the above
mapping, this square is deformed into the rectangle whose sides are also parallel to the two eigenvector directions. The area of the
square and rectangle are the same.


Figure 10.14.10 
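The eigenvalues and eigenvector directions of the cat-map matrix can be checked numerically without any special library. The sketch below uses the quadratic formula for a 2 × 2 matrix; the scaling chosen for the eigenvectors, (1, λ − 1), is a choice made here, and any nonzero multiples would serve equally well.

```python
import math

C = ((1, 1), (1, 2))

def eigenpairs(matrix):
    """Eigenvalues and eigenvectors of a 2x2 matrix with real eigenvalues and entry (0, 1) equal to 1."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace - 4 * det)
    lam1, lam2 = (trace + root) / 2, (trace - root) / 2
    # First row of (C - lam*I)v = 0 reads (a - lam)*vx + b*vy = 0; with b = 1, take vx = 1.
    v1 = (1.0, lam1 - a)
    v2 = (1.0, lam2 - a)
    return (lam1, v1), (lam2, v2)

(lam1, v1), (lam2, v2) = eigenpairs(C)
print(lam1, v1)
print(lam2, v2)
print("dot product:", v1[0] * v2[0] + v1[1] * v2[1])
print("product of eigenvalues:", lam1 * lam2)
```

The dot product being zero (up to rounding) reflects the perpendicularity of the eigenvectors, and the product of the eigenvalues being 1 reflects the area-preserving property.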


To explain the cause of the streaks in Figure 10.14.3, consider S to be part of the tiled plane, and let p be a point of S with period
n. Because we are considering tiling, there is a point q in the plane with the same color as p that on successive iterations moves
toward the position initially occupied by p, reaching that position on the nth iteration. This point is q = (C^n)^{-1} p = C^{-n} p, since

C^n q = C^n (C^{-n} p) = p

Thus, with successive iterations, points of S flow away from their initial positions, while at the same time other points in the plane
(with corresponding colors) flow toward those initial positions, completing their trip on the final iteration of the cycle. Figure
10.14.11 illustrates this in the case where n = 4, q = (−8/3, 5/3), and p = C^4 q = (1/3, 2/3). Note that
p mod 1 = q mod 1 = (1/3, 2/3), so both points occupy the same positions on their respective tiles. The outgoing point moves in
the general direction of the eigenvector v_1, as indicated by the arrows in Figure 10.14.11, and the incoming point moves in the
general direction of eigenvector v_2. It is the “flow lines” in the general directions of the eigenvectors that form the streaks in
Figure 10.14.3.
Figure 10.14.11
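The numbers in the n = 4 illustration can be verified with exact rational arithmetic. The sketch below builds the fourth power of the cat-map matrix, inverts it (using the fact that its determinant is 1), and recovers the incoming point from the point (1/3, 2/3); the helper names are introduced for this example only.

```python
import math
from fractions import Fraction

C = ((1, 1), (1, 2))

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_pow(a, n):
    """n-th power of a 2x2 matrix for n >= 1."""
    result = a
    for _ in range(n - 1):
        result = mat_mul(result, a)
    return result

def inverse_det1(a):
    """Inverse of a 2x2 matrix whose determinant is 1."""
    (w, x), (y, z) = a
    return ((z, -x), (-y, w))

def apply(a, v):
    """Matrix-vector product for a 2x2 matrix."""
    return (a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1])

n = 4
p_point = (Fraction(1, 3), Fraction(2, 3))
C_n = mat_pow(C, n)
q_point = apply(inverse_det1(C_n), p_point)
print(C_n)
print(q_point)
print(tuple(c - math.floor(c) for c in q_point))  # q mod 1
```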


Nonperiodic Points 


Thus far we have considered the effect of Arnold's cat map on pixel points of the form (m/p, n/p) for an arbitrary positive
integer p. We know that all such points are periodic. We now consider the effect of Arnold's cat map on an arbitrary point (a, b)
in S. We classify such points as rational if the coordinates a and b are both rational numbers, and irrational if at least one of the
coordinates is irrational. Every rational point is periodic, since it is a pixel point for a suitable choice of p. For example, the
rational point (r_1/s_1, r_2/s_2) can be written as (r_1 s_2 / s_1 s_2, r_2 s_1 / s_1 s_2), so it is a pixel point with p = s_1 s_2. It can be shown
(Exercise 13) that the converse is also true: Every periodic point must be a rational point.


It follows from the preceding discussion that the irrational points in S are nonperiodic, so that successive iterates of an irrational
point (x_0, y_0) in S must all be distinct points in S. Figure 10.14.12, which was computer generated, shows an irrational point and
selected iterates up to 100,000. For the particular irrational point that we selected, the iterates do not seem to cluster in any
particular region of S; rather, they appear to be spread throughout S, becoming denser with successive iterations.
Figure 10.14.12


The behavior of the iterates in Figure 10.14.12 is sufficiently important that there is some terminology associated with it. We say 
that a set D of points in S is dense in S if every circle centered at any point of S encloses points of D, no matter how small the 
radius of the circle is taken (Figure 10.14.13). It can be shown that the rational points are dense in S and the iterates of most (but 
not all) of the irrational points are dense in S.
Figure 10.14.13


Definition of Chaos 


We know that under Arnold's cat map, the rational points of S are periodic and dense in S and that some but not all of the
irrational points have iterates that are dense in S. These are the basic ingredients of chaos. There are several definitions of chaos in
current use, but the following one, which is an outgrowth of a definition introduced by Robert L. Devaney in 1986 in his book An
Introduction to Chaotic Dynamical Systems (Benjamin/Cummings Publishing Company), is most closely related to our work.


DEFINITION 1 


A mapping T of S onto itself is said to be chaotic if: 
(i) S contains a dense set of periodic points of the mapping T.

(ii) There is a point in S whose iterates under T are dense in S.


Thus Arnold's cat map satisfies the definition of a chaotic mapping. What is noteworthy about this definition is that a chaotic 
mapping exhibits an element of order and an element of disorder—the periodic points move regularly in cycles, but the points 
with dense iterates move irregularly, often obscuring the regularity of the periodic points. This fusion of order and disorder 
characterizes chaotic mappings. 


Dynamical Systems 


Chaotic mappings arise in the study of dynamical systems. Informally stated, a dynamical system can be viewed as a system that 
has a specific state or configuration at each point of time but that changes its state with time. Chemical systems, ecological 
systems, electrical systems, biological systems, economic systems, and so forth can be looked at in this way. In a discrete-time 
dynamical system, the state changes at discrete points of time rather than at each instant. In a discrete-time chaotic dynamical 
system, each state results from a chaotic mapping of the preceding state. For example, if one imagines that Arnold's cat map is 
applied at discrete points of time, then the pixel maps in Figure 10.14.3 can be viewed as the evolution of a discrete-time chaotic 
dynamical system from some initial set of states (each point of the cat is a single initial state) to successive sets of states. 


One of the fundamental problems in the study of dynamical systems is to predict future states of the system from a known initial 
state. In practice, however, the exact initial state is rarely known because of errors in the devices used to measure the initial state. 
It was believed at one time that if the measuring devices were sufficiently accurate and the computers used to perform the 
iteration were sufficiently powerful, then one could predict the future states of the system to any degree of accuracy. But the 
discovery of chaotic systems shattered this belief because it was found that for such systems the slightest error in measuring the 
initial state or in the computation of the iterates becomes magnified exponentially, thereby preventing an accurate prediction of 
future states. Let us demonstrate this sensitivity to initial conditions with Arnold's cat map. 


Suppose that P_0 is a point in the xy-plane whose exact coordinates are (0.77837, 0.70904). A measurement error of 0.00001 is
made in the y-coordinate, such that the point is thought to be located at (0.77837, 0.70905), which we denote by Q_0. Both P_0
and Q_0 are pixel points with p = 100,000 (why?), and thus, since Π(100,000) = 75,000, both return to their initial positions
after 75,000 iterations. In Figure 10.14.14 we show the first 50 iterates of P_0 under Arnold's cat map as crosses and the first 50
iterates of Q_0 as circles. Although P_0 and Q_0 are close enough that their symbols overlap initially, only their first eight iterates
have overlapping symbols; from the ninth iteration on their iterates follow divergent paths.


Figure 10.14.14 
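The divergence of the two orbits can be followed exactly by treating both points as pixel points with p = 100,000 and iterating their integer numerators. The sketch below prints the coordinate differences (reduced mod 1) at a few iterations; the choice of which iterations to print and the names used are illustrative only.

```python
P = 100_000

def cat_step(m, n, p=P):
    """One application of Arnold's cat map to the pixel point (m/p, n/p)."""
    return (m + n) % p, (m + 2 * n) % p

def iterates(m, n, count):
    """Return the starting point followed by its first `count` iterates."""
    out = [(m, n)]
    for _ in range(count):
        m, n = cat_step(m, n)
        out.append((m, n))
    return out

p_orbit = iterates(77837, 70904, 50)   # exact point (0.77837, 0.70904)
q_orbit = iterates(77837, 70905, 50)   # mismeasured point (0.77837, 0.70905)

def difference(k):
    """Coordinate differences between the k-th iterates, as numerators mod P."""
    dm = (q_orbit[k][0] - p_orbit[k][0]) % P
    dn = (q_orbit[k][1] - p_orbit[k][1]) % P
    return dm, dn

for k in (0, 3, 6, 9, 12):
    dm, dn = difference(k)
    print(k, dm / P, dn / P)
```

Because the map is linear before the mod reduction, the difference after k iterations is the k-th power of the matrix applied to the initial difference (0, 1/100,000), which is why the gaps grow by roughly the same factor at each step.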


It is possible to quantify the growth of the error from the eigenvalues and eigenvectors of Arnold's cat map. For this purpose we
will think of Arnold's cat map as a linear transformation on the tiled plane. Recall from Figure 10.14.10 and the related discussion
that the projected distance between two points in S in the direction of the eigenvector v_1 increases by a factor of 2.6180... (= λ_1)
with each iteration (Figure 10.14.15). After nine iterations this projected distance increases by a factor of
(2.6180...)^9 = 5777.99..., and with an initial error of roughly 1/100,000 in the direction of v_1, this distance is 0.05777..., or
about 1/17 the width of the unit square S. After 12 iterations this small initial error grows to (2.6180...)^12 / 100,000 = 1.0368...,
which is greater than the width of S. Thus, we completely lose track of the true iterates within S after 12 iterations because of the
exponential growth of the initial error.


Figure 10.14.15 
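The arithmetic behind the error estimate is a one-line computation, sketched below under the text's simplifying assumption that the whole initial error of 1/100,000 lies in the direction of v_1. The function name `projected_error` is introduced here for illustration.

```python
import math

LAMBDA_1 = (3 + math.sqrt(5)) / 2   # larger eigenvalue of the cat-map matrix

def projected_error(initial_error, iterations):
    """Error in the v_1 direction after the given number of iterations."""
    return initial_error * LAMBDA_1 ** iterations

for k in (9, 11, 12):
    print(k, LAMBDA_1 ** k, projected_error(1 / 100_000, k))

# First iteration at which the projected error exceeds the width of S.
first_exceeding = next(k for k in range(1, 100) if projected_error(1 / 100_000, k) > 1)
print(first_exceeding)
```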


Although sensitivity to initial conditions limits the ability to predict the future evolution of dynamical systems, new techniques 
are presently being investigated to describe this future evolution in alternative ways. 


Exercise Set 10.14 


1. In a journal article [F. J. Dyson and H. Falk, “Period of a Discrete Cat Mapping,” The American Mathematical Monthly, 99
(August-September 1992), pp. 603-614] the following results concerning the nature of the function Π(p) were established:

(i) Π(p) = 3p if and only if p = 2 · 5^k for k = 1, 2, ....

(ii) Π(p) = 2p if and only if p = 5^k for k = 1, 2, ... or p = 6 · 5^k for k = 0, 1, 2, ....

(iii) Π(p) ≤ 12p/7 for all other choices of p.

Find Π(250), Π(25), Π(125), Π(30), Π(10), Π(50), Π(3750), Π(6), and Π(5).


Answer:


Π(250) = 750, Π(25) = 50, Π(125) = 250, Π(30) = 60, Π(10) = 30, Π(50) = 150, Π(3750) = 7500, Π(6) = 12,
Π(5) = 10


2. Find all the n-cycles that are subsets of the 36 points in S of the form (m/6, n/6) with m and n in the range 0, 1, 2, 3, 4, 5.
Then find Π(6).


Answer:


One 1-cycle: {(0, 0)}; one 3-cycle: {(3/6, 0), (3/6, 3/6), (0, 3/6)}; two 4-cycles: {(2/6, 0), (2/6, 2/6), (4/6, 0), (4/6, 4/6)} and
{(0, 2/6), (2/6, 4/6), (0, 4/6), (4/6, 2/6)}; two 12-cycles:

{(1/6, 0), (1/6, 1/6), (2/6, 3/6), (5/6, 2/6), (1/6, 3/6), (4/6, 1/6), (5/6, 0), (5/6, 5/6), (4/6, 3/6), (1/6, 4/6), (5/6, 3/6), (2/6, 5/6)} and

{(0, 1/6), (1/6, 2/6), (3/6, 5/6), (2/6, 1/6), (3/6, 4/6), (1/6, 5/6), (0, 5/6), (5/6, 4/6), (3/6, 1/6), (4/6, 5/6), (3/6, 2/6), (5/6, 1/6)}; Π(6) = 12.
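The cycle decomposition in the answer above can be checked by a short enumeration. The sketch below partitions the p^2 pixel points into cycles for any small p; the function names are introduced here, and the order in which cycles and points are listed depends on the scanning order chosen.

```python
def cat_step(m, n, p):
    """One application of Arnold's cat map to the pixel point (m/p, n/p)."""
    return (m + n) % p, (m + 2 * n) % p

def cycles(p):
    """Partition the pixel points (m/p, n/p) into cycles, given as numerator pairs."""
    seen = set()
    result = []
    for m in range(p):
        for n in range(p):
            if (m, n) in seen:
                continue
            cycle, current = [], (m, n)
            while current not in seen:
                seen.add(current)
                cycle.append(current)
                current = cat_step(*current, p)
            result.append(cycle)
    return result

for cycle in cycles(6):
    print(len(cycle), cycle)
```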


3. (Fibonacci Shift-Register Random-Number Generator) A well-known method of generating a sequence of “pseudorandom”
integers $x_0, x_1, x_2, x_3, \ldots$ in the interval from 0 to p - 1 is based on the following algorithm:

(i) Pick any two integers $x_0$ and $x_1$ from the range 0, 1, 2, ..., p - 1.

(ii) Set $x_{n+1} = (x_n + x_{n-1}) \bmod p$ for n = 1, 2, ....

Here x mod p denotes the number in the interval from 0 to p - 1 that differs from x by a multiple of p. For example, 35 mod
9 = 8 (because 8 = 35 - 3 · 9); 36 mod 9 = 0 (because 0 = 36 - 4 · 9); and -3 mod 9 = 6 (because 6 = -3 + 1 · 9).
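The two steps of the algorithm can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `fibonacci_generator` is an assumption made for this example, and the loop stops as soon as the starting pair $(x_0, x_1)$ reappears, which must eventually happen because the step $(x_{n-1}, x_n) \to (x_n, x_{n+1})$ is reversible modulo p.

```python
def fibonacci_generator(p, x0, x1):
    """Return x0, x1, x2, ... up to (not including) the first repeat of the pair (x0, x1)."""
    seq = [x0, x1]
    a, b = x0, x1
    while True:
        # Step (ii): the next term is the sum of the previous two, reduced mod p.
        a, b = b, (a + b) % p
        if (a, b) == (x0, x1):
            # The starting pair has reappeared; drop the repeated x0 at the end.
            return seq[:-1]
        seq.append(b)


if __name__ == "__main__":
    print(fibonacci_generator(15, 3, 7))
```

With p = 15, $x_0 = 3$, and $x_1 = 7$ the sketch produces 40 terms before the sequence starts repeating.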


(a) Generate the sequence of pseudorandom numbers that results from the choices p = 15, $x_0 = 3$, and $x_1 = 7$ until the
sequence starts repeating.


(b) Show that the following formula is equivalent to step (ii) of the algorithm:

\[ \begin{bmatrix} x_{n+1} \\ x_{n+2} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_{n-1} \\ x_n \end{bmatrix} \bmod p \quad \text{for } n = 1, 2, 3, \ldots \]


(c) Use the formula in part (b) to generate the sequence of vectors for the choices p = 21, $x_0 = 5$, and $x_1 = 5$ until the
sequence starts repeating.


Answer: 


(a) 3, 7, 10, 2, 12, 14, 11, 10, 6, 1, 7, 8, 0, 8, 8, 1, 9, 10, 4, 14, 3, 2, 5, 7, 12, 4, 1, 5, 6, 11, 2, 13, 0, 13, 13, 11, 9, 5, 14, 4, 3, 7, ...
(c) (5, 5), (10, 15), (4, 19), (2, 0), (2, 2), (4, 6), (10, 16), (5, 0), (5, 5)... 


Remark If we take p = 1 and pick $x_0$ and $x_1$ from the interval [0, 1), then the above random-number generator produces
pseudorandom numbers in the interval [0, 1). The resulting scheme is precisely Arnold's cat map. Furthermore, if we eliminate
the modular arithmetic in the algorithm and take $x_0 = x_1 = 1$, then the resulting sequence of integers is the famous Fibonacci
sequence, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ..., in which each number after the first two is the sum of the preceding two
numbers.


4. For $C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$ it can be verified that

\[ C^{25} = \begin{bmatrix} 7{,}778{,}742{,}049 & 12{,}586{,}269{,}025 \\ 12{,}586{,}269{,}025 & 20{,}365{,}011{,}074 \end{bmatrix} \]

It can also be verified that 12,586,269,025 is divisible by 101 and that when 7,778,742,049 and 20,365,011,074 are divided by
101, the remainder is 1.


(a) Show that every point in S of the form (m/101, n/101) returns to its starting position after 25 iterations under Arnold's
cat map.


(b) Show that every point in S of the form (m/101, n/101) has period 1, 5, or 25.


(c) Show that the point (1/101, 0) has period greater than 5 by iterating it five times.


(d) Show that $\Pi(101) = 25$.


Answer:

(c) The first five iterates of (1/101, 0) are (1/101, 1/101), (2/101, 3/101), (5/101, 8/101), (13/101, 21/101), and (34/101, 55/101).

5. Show that for the mapping $T: S \to S$ defined by $T(x, y) = \left(x + \tfrac{5}{12},\, y\right) \bmod 1$, every point in S is a periodic point. Why does
this show that the mapping is not chaotic?


6. An Anosov automorphism on $R^2$ is a mapping from the unit square S onto S of the form

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \]

in which (i) a, b, c, and d are integers, (ii) the determinant of the matrix is ±1, and (iii) the eigenvalues of the matrix do not
have magnitude 1. It can be shown that all Anosov automorphisms are chaotic mappings.


(a) Show that Arnold's cat map is an Anosov automorphism. 


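(b) Which of the following are the matrices of an Anosov automorphism?

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 5 & 7 \\ 2 & 3 \end{bmatrix}, \quad \begin{bmatrix} 6 & 2 \\ 5 & 2 \end{bmatrix} \]

(c) Show that the following mapping of S onto S is not an Anosov automorphism.

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \bmod 1 \]

What is the geometric effect of this transformation on S? Use your observation to show that the mapping is not a chaotic
mapping by showing that all points in S are periodic points.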


Answer: 


(b) The matrices of Anosov automorphisms are $\begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}$ and $\begin{bmatrix} 5 & 7 \\ 2 & 3 \end{bmatrix}$.


(c) The transformation effects a rotation of S through 90° in the clockwise direction.


7. Show that Arnold's cat map is one-to-one over the unit square S and that its range is S. 
8. Show that the inverse of Arnold's cat map is given by
\[ \Gamma^{-1}(x, y) = (2x - y,\; -x + y) \bmod 1 \]


9. Show that the unit square S can be partitioned into four triangular regions on each of which Arnold's cat map is a
transformation of the form

\[ \begin{bmatrix} x \\ y \end{bmatrix} \to \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} a \\ b \end{bmatrix} \]

where a and b need not be the same for each region. [Hint: Find the regions in S that map onto the four shaded regions of the
parallelogram in Figure 10.14.14.]


Answer: 


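[Figure: the unit square S partitioned into four triangular regions labeled I, II, III, and IV]

In region I: $\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$; in regions II, III, and IV the vector $\begin{bmatrix} a \\ b \end{bmatrix}$ takes the values $\begin{bmatrix} 0 \\ -1 \end{bmatrix}$, $\begin{bmatrix} -1 \\ -1 \end{bmatrix}$, and $\begin{bmatrix} -1 \\ -2 \end{bmatrix}$, one value in each region.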
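10. If $(x_0, y_0)$ is a point in S and $(x_n, y_n)$ is its nth iterate under Arnold's cat map, show that
\[ \begin{bmatrix} x_n \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]
This result implies that the modular arithmetic need only be performed once rather than after each iteration.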
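The claim in Exercise 10 can be checked numerically before it is proved. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction`; the function names `cat_iterate` and `cat_power` are made up for this illustration. The first function reduces modulo 1 after every iteration, the second builds the integer matrix power first and reduces only once.

```python
from fractions import Fraction


def cat_iterate(x, y, n):
    """Apply the cat map n times, reducing modulo 1 after every step."""
    for _ in range(n):
        x, y = (x + y) % 1, (x + 2 * y) % 1
    return x, y


def cat_power(x, y, n):
    """Form the integer matrix [[1, 1], [1, 2]]^n first, then reduce modulo 1 once."""
    a, b, c, d = 1, 0, 0, 1  # start from the identity matrix
    for _ in range(n):
        # Left-multiply [[a, b], [c, d]] by [[1, 1], [1, 2]].
        a, b, c, d = a + c, b + d, a + 2 * c, b + 2 * d
    return (a * x + b * y) % 1, (c * x + d * y) % 1


if __name__ == "__main__":
    start = (Fraction(1, 101), Fraction(0))
    print(cat_iterate(*start, 5), cat_power(*start, 5))
```

Using fractions rather than floating-point numbers avoids the rounding errors that the sensitivity discussion in this section warns about.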


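11. Show that (0, 0) is the only fixed point of Arnold's cat map by showing that the only solution of the equation

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

with $0 \le x_0 < 1$ and $0 \le y_0 < 1$ is $x_0 = y_0 = 0$. [Hint: For appropriate nonnegative integers, r and s, we can write

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} - \begin{bmatrix} r \\ s \end{bmatrix} \]

for the preceding equation.]


12. Find all 2-cycles of Arnold's cat map by finding all solutions of the equation

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^2 \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

with $0 \le x_0 < 1$ and $0 \le y_0 < 1$. [Hint: For appropriate nonnegative integers, r and s, we can write

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} - \begin{bmatrix} r \\ s \end{bmatrix} \]

for the preceding equation.]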


Answer:

(1/5, 3/5) and (4/5, 2/5) form one 2-cycle, and (2/5, 1/5) and (3/5, 4/5) form another 2-cycle.


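13. Show that every periodic point of Arnold's cat map must be a rational point by showing that for all solutions of the equation

\[ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^n \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \bmod 1 \]

the numbers $x_0$ and $y_0$ are quotients of integers.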


14. Let T be the Arnold's cat map applied five times in a row; that is, $T = \Gamma^5$. Figure Ex-14 represents four successive mappings
of T on the first image, each image having a resolution of 101 × 101 pixels. The fifth mapping returns to the first image
because this cat map has a period of 25. Explain how you might generate this particular sequence of images.


Figure Ex-14 


Answer: 


Begin with a 101 x 101 array of white pixels and add the letter ‘A’ in black pixels to it. Apply the mapping to this image, 
which will scatter the black pixels throughout the image. Then superimpose the letter ‘B’ in black pixels onto this image. 
Apply the mapping again and then superimpose the letter ‘C’ in black pixels onto the resulting image. Repeat this procedure 
with the letters ‘D’ and ‘E’. The next application of the mapping will return you to the letter ‘A’ with the pixels for the letters 
‘B’ through ‘E’ scattered in the background.


Section 10.14 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, 
Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with some 
linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular utility you are 
using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. Once you have 
mastered the techniques in these exercises, you will be able to use your technology utility to solve many of the problems in the 
regular exercise sets. 


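T1. The methods of Exercise 4 show that for the cat map, $\Pi(p)$ is the smallest integer satisfying the equation

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{\Pi(p)} \bmod p = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This suggests that one way to determine $\Pi(p)$ is to compute

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{n} \bmod p \]

starting with n = 1 and stopping when this produces the identity matrix. Use this idea to compute $\Pi(p)$ for p = 2, 3, ..., 10.
Compare your results to the formulas given in Exercise 1, if they apply. What can you conjecture about

\[ \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}^{\Pi(p)/2} \bmod p \]

when $\Pi(p)$ is even?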
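One possible way to carry out the computation described in T1 is sketched below in Python. It is not tied to any particular technology utility; the function name `cat_period` is an assumption for this example, and the sketch assumes p ≥ 2. Each pass multiplies the current power by the cat map matrix and reduces the entries modulo p, stopping at the first power that equals the identity.

```python
def cat_period(p):
    """Smallest n >= 1 with [[1, 1], [1, 2]]^n congruent to the identity modulo p (p >= 2)."""
    identity = (1, 0, 0, 1)
    m = (1 % p, 1 % p, 1 % p, 2 % p)  # the matrix itself, entries stored row by row
    n = 1
    while m != identity:
        a, b, c, d = m
        # Left-multiply by [[1, 1], [1, 2]] and reduce every entry modulo p.
        m = ((a + c) % p, (b + d) % p, (a + 2 * c) % p, (b + 2 * d) % p)
        n += 1
    return n


if __name__ == "__main__":
    for p in range(2, 11):
        print(p, cat_period(p))
```

The values for p = 5, 6, and 10 can be compared with the answers to Exercise 1.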


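T2. The eigenvalues and eigenvectors for the cat map matrix

\[ C = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \]

are

\[ \lambda_1 = \frac{3 + \sqrt{5}}{2}, \quad \lambda_2 = \frac{3 - \sqrt{5}}{2}, \quad v_1 = \begin{bmatrix} 1 \\ \frac{1 + \sqrt{5}}{2} \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ \frac{1 - \sqrt{5}}{2} \end{bmatrix} \]

Using these eigenvalues and eigenvectors, we can define

\[ D = \begin{bmatrix} \frac{3 + \sqrt{5}}{2} & 0 \\ 0 & \frac{3 - \sqrt{5}}{2} \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 1 & 1 \\ \frac{1 + \sqrt{5}}{2} & \frac{1 - \sqrt{5}}{2} \end{bmatrix} \]

and write $C = PDP^{-1}$; hence, $C^n = PD^nP^{-1}$. Use a computer to show that

\[ C^n = \begin{bmatrix} c_{11}^{(n)} & c_{12}^{(n)} \\ c_{21}^{(n)} & c_{22}^{(n)} \end{bmatrix} \]

where

\[ c_{11}^{(n)} = \left(\frac{1 + \sqrt{5}}{2\sqrt{5}}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2\sqrt{5}}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^n \]

\[ c_{22}^{(n)} = \left(\frac{1 + \sqrt{5}}{2\sqrt{5}}\right)\left(\frac{3 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2\sqrt{5}}\right)\left(\frac{3 - \sqrt{5}}{2}\right)^n \]

and

\[ c_{12}^{(n)} = c_{21}^{(n)} = \frac{1}{\sqrt{5}}\left[\left(\frac{3 + \sqrt{5}}{2}\right)^n - \left(\frac{3 - \sqrt{5}}{2}\right)^n\right] \]

How can you use these results and your conclusions in Exercise T1 to simplify the method for computing $\Pi(p)$?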


Copyright © 2010 John Wiley & Sons, Inc. All rights reserved. 


10.15 Cryptography 


In this section we present a method of encoding and decoding messages. We also examine modular arithmetic and show 
how Gaussian elimination can sometimes be used to break an opponent's code. 


Prerequisites 


Matrices 

Gaussian Elimination 
Matrix Operations 
Linear Independence 


Linear Transformations (Section 4.9) 


Ciphers 


The study of encoding and decoding secret messages is called cryptography. Although secret codes date to the earliest days 
of written communication, there has been a recent surge of interest in the subject because of the need to maintain the 
privacy of information transmitted over public lines of communication. In the language of cryptography, codes are called 
ciphers, uncoded messages are called plaintext, and coded messages are called ciphertext. The process of converting from 
plaintext to ciphertext is called enciphering, and the reverse process of converting from ciphertext to plaintext is called 
deciphering. 


The simplest ciphers, called substitution ciphers, are those that replace each letter of the alphabet by a different letter. For 
example, in the substitution cipher 


Plain   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher  D E F G H I J K L M N O P Q R S T U V W X Y Z A B C


the plaintext letter A is replaced by D, the plaintext letter B by E, and so forth. With this cipher the plaintext message
ROME WAS NOT BUILT IN A DAY
becomes


URPH ZDV QRW EXLOW LQ D GDB


Hill Ciphers 


A disadvantage of substitution ciphers is that they preserve the frequencies of individual letters, making it relatively easy to 
break the code by statistical methods. One way to overcome this problem is to divide the plaintext into groups of letters and 
encipher the plaintext group by group, rather than one letter at a time. A system of cryptography in which the plaintext is
divided into sets of n letters, each of which is replaced by a set of n cipher letters, is called a polygraphic system. In this
section we will study a class of polygraphic systems based on matrix transformations. [The ciphers that we will discuss are
called Hill ciphers after Lester S. Hill, who introduced them in two papers: “Cryptography in an Algebraic Alphabet,”
American Mathematical Monthly, 36 (June-July 1929), pp. 306-312; and “Concerning Certain Linear Transformation
Apparatus of Cryptography,” American Mathematical Monthly, 38 (March 1931), pp. 135-154.]


In the discussion to follow, we assume that each plaintext and ciphertext letter except Z is assigned the numerical value that 
specifies its position in the standard alphabet (Table 1). For reasons that will become clear later, Z is assigned a value of
zero.


Table 1

A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 0


In the simplest Hill ciphers, successive pairs of plaintext are transformed into ciphertext by the following procedure:


Step 1 Choose a 2 × 2 matrix with integer entries

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]

to perform the encoding. Certain additional conditions on A will be imposed later.


Step 2 Group successive plaintext letters into pairs, adding an arbitrary “dummy” letter to fill out the last pair if the 
plaintext has an odd number of letters, and replace each plaintext letter by its numerical value. 


Step 3 Successively convert each plaintext pair $p_1 p_2$ into a column vector

\[ \mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} \]

and form the product Ap. We will call p a plaintext vector and Ap the corresponding ciphertext vector.


Step 4 Convert each ciphertext vector into its alphabetic equivalent. 


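EXAMPLE 1 Hill Cipher of a Message


Use the matrix

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \]

to obtain the Hill cipher for the plaintext message
I AM HIDING


Solution If we group the plaintext into pairs and add the dummy letter G to fill out the last pair, we obtain
IA  MH  ID  IN  GG
or, equivalently, from Table 1,
9 1   13 8   9 4   9 14   7 7


To encipher the pair IA, we form the matrix product

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 9 \\ 1 \end{bmatrix} = \begin{bmatrix} 11 \\ 3 \end{bmatrix} \]

which, from Table 1, yields the ciphertext KC.
To encipher the pair MH, we form the product

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 13 \\ 8 \end{bmatrix} = \begin{bmatrix} 29 \\ 24 \end{bmatrix} \tag{1} \]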

However, there is a problem here, because the number 29 has no alphabet equivalent (Table 1). To resolve 
this problem, we make the following agreement: 


Whenever an integer greater than 25 occurs, it will be 
replaced by the remainder that results when this 
integer is divided by 26. 


Because the remainder after division by 26 is one of the integers 0, 1, 2, ..., 25, this procedure will always 
yield an integer with an alphabet equivalent. 


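Thus, in (1) we replace 29 by 3, which is the remainder after dividing 29 by 26. It now follows from Table 1
that the ciphertext for the pair MH is CX.


The computations for the remaining ciphertext vectors are

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 9 \\ 4 \end{bmatrix} = \begin{bmatrix} 17 \\ 12 \end{bmatrix} \]

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 9 \\ 14 \end{bmatrix} = \begin{bmatrix} 37 \\ 42 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 11 \\ 16 \end{bmatrix} \pmod{26} \]

\[ \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 7 \\ 7 \end{bmatrix} = \begin{bmatrix} 21 \\ 21 \end{bmatrix} \]

These correspond to the ciphertext pairs QL, KP, and UU, respectively. In summary, the entire ciphertext
message is


KC CX QL KP UU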


which would usually be transmitted as a single string without spaces: 


KCCXQLKPUU 


Because the plaintext was grouped in pairs and enciphered by a 2 × 2 matrix, the Hill cipher in Example 1 is referred to as a
Hill 2-cipher. It is obviously also possible to group the plaintext in triples and encipher by a 3 × 3 matrix with integer
entries; this is called a Hill 3-cipher. In general, for a Hill n-cipher, plaintext is grouped into sets of n letters and
enciphered by an n × n matrix with integer entries.
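The four enciphering steps, together with the agreement to replace integers by their remainders after division by 26, can be sketched in Python as follows. This is an illustrative sketch rather than part of the original procedure: the function name `hill_encipher` is hypothetical, the key is passed as a tuple of rows, and the dummy letter is taken to be a repeat of the last plaintext letter (the text allows any dummy letter; this choice happens to reproduce the G used in Example 1).

```python
def hill_encipher(plaintext, key):
    """Encipher plaintext with a Hill n-cipher; key is an n x n tuple of integer rows."""
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    n = len(key)
    # Step 2: pad the last group; repeating the final letter is an assumption of this sketch.
    while len(letters) % n:
        letters.append(letters[-1])
    # Table 1: A = 1, B = 2, ..., Y = 25, Z = 0.
    numbers = [(ord(c) - ord("A") + 1) % 26 for c in letters]
    cipher = []
    for i in range(0, len(numbers), n):
        block = numbers[i:i + n]
        for row in key:
            # Step 3: one entry of the product Ap, reduced modulo 26.
            value = sum(a * p for a, p in zip(row, block)) % 26
            # Step 4: back to a letter, with 0 mapping to Z.
            cipher.append(chr((value - 1) % 26 + ord("A")))
    return "".join(cipher)


if __name__ == "__main__":
    print(hill_encipher("I AM HIDING", ((1, 2), (0, 3))))
```

Called with the matrix and message of Example 1, the sketch returns the string KCCXQLKPUU.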


Modular Arithmetic 


In Example 1, integers greater than 25 were replaced by their remainders after division by 26. This technique of working 
with remainders is at the core of a body of mathematics called modular arithmetic. Because of its importance in 
cryptography, we will digress for a moment to touch on some of the main ideas in this area. 


In modular arithmetic we are given a positive integer m, called the modulus, and any two integers whose difference is an 
integer multiple of the modulus are regarded as “equal” or “equivalent” with respect to the modulus. More precisely, we 
make the following definition. 


DEFINITION 1 


If m is a positive integer and a and b are any integers, then we say that a is equivalent to b modulo m, written
\[ a \equiv b \pmod{m} \]
if a - b is an integer multiple of m.


EXAMPLE 2 Various Equivalences


\[ 7 \equiv 2 \pmod{5} \]
\[ 19 \equiv 3 \pmod{2} \]
\[ -1 \equiv 25 \pmod{26} \]
\[ 12 \equiv 0 \pmod{4} \]


For any modulus m it can be proved that every integer a is equivalent, modulo m, to exactly one of the integers
0, 1, 2, ..., m - 1
We call this integer the residue of a modulo m, and we write
\[ Z_m = \{0, 1, 2, \ldots, m - 1\} \]
to denote the set of residues modulo m.


If a is a nonnegative integer, then its residue modulo m is simply the remainder that results when a is divided by m. For an 
arbitrary integer a, the residue can be found using the following theorem. 


THEOREM 10.15.1 


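For any integer a and modulus m, let
\[ R = \text{remainder of } \frac{|a|}{m} \]
Then the residue r of a modulo m is given by
\[ r = \begin{cases} R & \text{if } a \ge 0 \\ m - R & \text{if } a < 0 \text{ and } R \ne 0 \\ 0 & \text{if } a < 0 \text{ and } R = 0 \end{cases} \]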


EXAMPLE 3 Residues mod 26
Find the residue modulo 26 of (a) 87, (b) -38, and (c) -26.


Solution
(a) Dividing |87| = 87 by 26 yields a remainder of R = 9, so r = 9. Thus,
\[ 87 \equiv 9 \pmod{26} \]
(b) Dividing |-38| = 38 by 26 yields a remainder of R = 12, so r = 26 - 12 = 14. Thus,
\[ -38 \equiv 14 \pmod{26} \]
(c) Dividing |-26| = 26 by 26 yields a remainder of R = 0. Thus,
\[ -26 \equiv 0 \pmod{26} \]
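Theorem 10.15.1 translates directly into code. The sketch below is illustrative only; the function name `residue` is an assumption. It follows the three cases of the theorem literally rather than relying on a language's built-in remainder operator, whose behavior for negative numbers differs from language to language.

```python
def residue(a, m):
    """Residue of the integer a modulo m, computed case by case as in Theorem 10.15.1."""
    R = abs(a) % m  # remainder of |a| divided by m
    if a >= 0:
        return R
    if R != 0:
        return m - R
    return 0


if __name__ == "__main__":
    for a in (87, -38, -26):
        print(a, residue(a, 26))
```

In Python specifically, the expression `a % m` with positive m already returns this residue, so the function mainly serves to mirror the statement of the theorem.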


In ordinary arithmetic every nonzero number a has a reciprocal or multiplicative inverse, denoted by $a^{-1}$, such that

\[ a a^{-1} = a^{-1} a = 1 \]

In modular arithmetic we have the following corresponding concept:


DEFINITION 2 


If a is a number in $Z_m$, then a number $a^{-1}$ in $Z_m$ is called a reciprocal or multiplicative inverse of a modulo m if

\[ a a^{-1} = a^{-1} a \equiv 1 \pmod{m} \]


It can be proved that if a and m have no common prime factors, then a has a unique reciprocal modulo m; conversely, if a 
and m have a common prime factor, then a has no reciprocal modulo m. 


EXAMPLE 4 Reciprocal of 3 mod 26


The number 3 has a reciprocal modulo 26 because 3 and 26 have no common prime factors. This reciprocal
can be obtained by finding the number x in $Z_{26}$ that satisfies the modular equation

\[ 3x \equiv 1 \pmod{26} \]

Although there are general methods for solving such modular equations, it would take us too far afield to
study them. However, because 26 is relatively small, this equation can be solved by trying the possible
solutions, 0 to 25, one at a time. With this approach we find that x = 9 is the solution, because

\[ 3 \cdot 9 = 27 \equiv 1 \pmod{26} \]
Thus,
\[ 3^{-1} = 9 \pmod{26} \]


EXAMPLE 5 A Number with No Reciprocal mod 26


The number 4 has no reciprocal modulo 26, because 4 and 26 have 2 as a common prime factor (see Exercise
9).


For future reference, in Table 2 we provide the following reciprocals modulo 26:


Table 2 Reciprocals Modulo 26

a        1   3   5   7   9   11  15  17  19  21  23  25
a^{-1}   1   9   21  15  3   19  7   23  11  5   17  25
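The trial-and-error search used in Example 4 is easy to automate. The following sketch, with the hypothetical function name `reciprocal`, tries every candidate in $Z_m$ and reports `None` when no reciprocal exists, as happens for 4 modulo 26 in Example 5. Running it over all of $Z_{26}$ lists every residue that has a reciprocal.

```python
def reciprocal(a, m):
    """Return the x in Z_m with a*x congruent to 1 modulo m, or None if there is none."""
    for x in range(m):
        if (a * x) % m == 1:
            return x
    return None


def reciprocal_table(m):
    """Map each residue of Z_m that has a reciprocal to that reciprocal."""
    table = {}
    for a in range(m):
        x = reciprocal(a, m)
        if x is not None:
            table[a] = x
    return table


if __name__ == "__main__":
    print(reciprocal(3, 26), reciprocal(4, 26))
    print(reciprocal_table(26))
```

For a modulus as small as 26 the exhaustive search is perfectly adequate; the general methods that the text mentions but does not develop are what one would use for large moduli.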


Deciphering 


Every useful cipher must have a procedure for decipherment. In the case of a Hill cipher, decipherment uses the inverse
(mod 26) of the enciphering matrix. To be precise, if m is a positive integer, then a square matrix A with entries in $Z_m$ is
said to be invertible modulo m if there is a matrix B with entries in $Z_m$ such that

\[ AB = BA \equiv I \pmod{m} \]

Suppose now that

\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]

is invertible modulo 26 and this matrix is used in a Hill 2-cipher. If

\[ \mathbf{p} = \begin{bmatrix} p_1 \\ p_2 \end{bmatrix} \]

is a plaintext vector, then

\[ \mathbf{c} = A\mathbf{p} \pmod{26} \]

is the corresponding ciphertext vector and

\[ \mathbf{p} = A^{-1}\mathbf{c} \pmod{26} \]

Thus, each plaintext vector can be recovered from the corresponding ciphertext vector by multiplying it on the left by
$A^{-1}$ (mod 26).


In cryptography it is important to know which matrices are invertible modulo 26 and how to obtain their inverses. We now 
investigate these questions. 


In ordinary arithmetic, a square matrix A is invertible if and only if $\det(A) \ne 0$, or, equivalently, if and only if det(A) has a
reciprocal. The following theorem is the analog of this result in modular arithmetic.


THEOREM 10.15.2


A square matrix A with entries in $Z_m$ is invertible modulo m if and only if the residue of det(A) modulo m has a
reciprocal modulo m.


Because the residue of det(A) modulo m will have a reciprocal modulo m if and only if this residue and m have no common
prime factors, we have the following corollary.


COROLLARY 10.15.3


A square matrix A with entries in $Z_m$ is invertible modulo m if and only if m and the residue of det(A) modulo m
have no common prime factors.


Because the only prime factors of m = 26 are 2 and 13, we have the following corollary, which is useful in cryptography.


COROLLARY 10.15.4


A square matrix A with entries in $Z_{26}$ is invertible modulo 26 if and only if the residue of det(A) modulo 26 is not
divisible by 2 or 13.


We leave it for you to verify that if

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

has entries in $Z_{26}$ and the residue of det(A) = ad - bc modulo 26 is not divisible by 2 or 13, then the inverse of A (mod
26) is given by

\[ A^{-1} = (ad - bc)^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \pmod{26} \tag{2} \]

where $(ad - bc)^{-1}$ is the reciprocal of the residue of ad - bc (mod 26).


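EXAMPLE 6 Inverse of a Matrix mod 26


Find the inverse of

\[ A = \begin{bmatrix} 5 & 6 \\ 2 & 3 \end{bmatrix} \]

modulo 26.


Solution
\[ \det(A) = ad - bc = 5 \cdot 3 - 6 \cdot 2 = 3 \]
so from Table 2,

\[ (ad - bc)^{-1} = 3^{-1} = 9 \pmod{26} \]

Thus, from (2),

\[ A^{-1} = 9 \begin{bmatrix} 3 & -6 \\ -2 & 5 \end{bmatrix} = \begin{bmatrix} 27 & -54 \\ -18 & 45 \end{bmatrix} = \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \pmod{26} \]

As a check,

\[ AA^{-1} = \begin{bmatrix} 5 & 6 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} = \begin{bmatrix} 53 & 234 \\ 26 & 105 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \pmod{26} \]

Similarly, $A^{-1}A = I$.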
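Formula (2), combined with the test in Corollary 10.15.4, can be sketched as a short Python function. The name `inverse_mod26` and the choice to raise a `ValueError` for non-invertible matrices are assumptions of this example; the reciprocal of the determinant is found by the same exhaustive search used in Example 4.

```python
def inverse_mod26(A):
    """Inverse modulo 26 of a 2 x 2 integer matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % 26
    # Corollary 10.15.4: the residue of det(A) must not be divisible by 2 or 13.
    if det % 2 == 0 or det % 13 == 0:
        raise ValueError("matrix is not invertible modulo 26")
    # Reciprocal of the determinant modulo 26, found by trying 0, 1, ..., 25.
    det_inv = next(x for x in range(26) if (det * x) % 26 == 1)
    # Formula (2): multiply the adjugate by det_inv and reduce each entry.
    return (((det_inv * d) % 26, (-det_inv * b) % 26),
            ((-det_inv * c) % 26, (det_inv * a) % 26))


if __name__ == "__main__":
    print(inverse_mod26(((5, 6), (2, 3))))
```

Applied to the matrix of Example 6, the sketch returns the rows (1, 24) and (8, 19).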


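EXAMPLE 7 Decoding a Hill 2-Cipher


Decode the following Hill 2-cipher, which was enciphered by the matrix in Example 6:
GTNKGKDUSK


Solution From Table 1 the numerical equivalent of this ciphertext is
7 20   14 11   7 11   4 21   19 11


To obtain the plaintext pairs, we multiply each ciphertext vector by the inverse of A (obtained in Example 6):

\[ \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \begin{bmatrix} 7 \\ 20 \end{bmatrix} = \begin{bmatrix} 487 \\ 436 \end{bmatrix} = \begin{bmatrix} 19 \\ 20 \end{bmatrix} \pmod{26} \]

\[ \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \begin{bmatrix} 14 \\ 11 \end{bmatrix} = \begin{bmatrix} 278 \\ 321 \end{bmatrix} = \begin{bmatrix} 18 \\ 9 \end{bmatrix} \pmod{26} \]

\[ \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \begin{bmatrix} 7 \\ 11 \end{bmatrix} = \begin{bmatrix} 271 \\ 265 \end{bmatrix} = \begin{bmatrix} 11 \\ 5 \end{bmatrix} \pmod{26} \]

\[ \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \begin{bmatrix} 4 \\ 21 \end{bmatrix} = \begin{bmatrix} 508 \\ 431 \end{bmatrix} = \begin{bmatrix} 14 \\ 15 \end{bmatrix} \pmod{26} \]

\[ \begin{bmatrix} 1 & 24 \\ 8 & 19 \end{bmatrix} \begin{bmatrix} 19 \\ 11 \end{bmatrix} = \begin{bmatrix} 283 \\ 361 \end{bmatrix} = \begin{bmatrix} 23 \\ 23 \end{bmatrix} \pmod{26} \]

From Table 1, the alphabet equivalents of these vectors are
ST RI KE NO WW
which yields the message


STRIKE NOW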


Breaking a Hill Cipher 


Because the purpose of enciphering messages and information is to prevent “opponents” from learning their contents, 
cryptographers are concerned with the security of their ciphers—that is, how readily they can be broken (deciphered by 
their opponents). We will conclude this section by discussing one technique for breaking Hill ciphers. 


Suppose that you are able to obtain some corresponding plaintext and ciphertext from an opponent's message. For example, 
on examining some intercepted ciphertext, you may be able to deduce that the message is a letter that begins DEAR SIR. We
will show that with a small amount of such data, it may be possible to determine the deciphering matrix of a Hill code and 
consequently obtain access to the rest of the message. 


It is a basic result in linear algebra that a linear transformation is completely determined by its values at a basis. This 
principle suggests that if we have a Hill n-cipher, and if

\[ \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n \]

are linearly independent plaintext vectors whose corresponding ciphertext vectors

\[ A\mathbf{p}_1, A\mathbf{p}_2, \ldots, A\mathbf{p}_n \]

are known, then there is enough information available to determine the matrix A and hence $A^{-1}$ (mod m).


The following theorem, whose proof is discussed in the exercises, provides a way to do this. 


THEOREM 10.15.5 Determining the Deciphering Matrix 


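Let $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ be linearly independent plaintext vectors, and let $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_n$ be the corresponding ciphertext
vectors in a Hill n-cipher. If

\[ P = \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \\ \vdots \\ \mathbf{p}_n^T \end{bmatrix} \]

is the n × n matrix with row vectors $\mathbf{p}_1^T, \mathbf{p}_2^T, \ldots, \mathbf{p}_n^T$ and if

\[ C = \begin{bmatrix} \mathbf{c}_1^T \\ \mathbf{c}_2^T \\ \vdots \\ \mathbf{c}_n^T \end{bmatrix} \]

is the n × n matrix with row vectors $\mathbf{c}_1^T, \mathbf{c}_2^T, \ldots, \mathbf{c}_n^T$, then the sequence of elementary row operations that reduces C
to I transforms P to $(A^{-1})^T$.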


This theorem tells us that to find the transpose of the deciphering matrix $A^{-1}$, we must find a sequence of row operations
that reduces C to I and then perform this same sequence of operations on P. The following example illustrates a simple
algorithm for doing this.


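EXAMPLE 8 Using Theorem 10.15.5


The following Hill 2-cipher is intercepted:

IOSBTGXESPXHOPDE

Decipher the message, given that it starts with the word DEAR.


Solution From Table 1, the numerical equivalent of the known plaintext is

DE     AR
4 5    1 18

and the numerical equivalent of the corresponding ciphertext is

IO     SB
9 15   19 2

so the corresponding plaintext and ciphertext vectors are

\[ \mathbf{p}_1 = \begin{bmatrix} 4 \\ 5 \end{bmatrix} \leftrightarrow \mathbf{c}_1 = \begin{bmatrix} 9 \\ 15 \end{bmatrix}, \qquad \mathbf{p}_2 = \begin{bmatrix} 1 \\ 18 \end{bmatrix} \leftrightarrow \mathbf{c}_2 = \begin{bmatrix} 19 \\ 2 \end{bmatrix} \]

We want to reduce

\[ C = \begin{bmatrix} \mathbf{c}_1^T \\ \mathbf{c}_2^T \end{bmatrix} = \begin{bmatrix} 9 & 15 \\ 19 & 2 \end{bmatrix} \]

to I by elementary row operations and simultaneously apply these operations to

\[ P = \begin{bmatrix} \mathbf{p}_1^T \\ \mathbf{p}_2^T \end{bmatrix} = \begin{bmatrix} 4 & 5 \\ 1 & 18 \end{bmatrix} \]

to obtain $(A^{-1})^T$ (the transpose of the deciphering matrix). This can be accomplished by adjoining P to the
right of C and applying row operations to the resulting matrix [C | P] until the left side is reduced to I. The
final matrix will then have the form $[I \mid (A^{-1})^T]$. The computations can be carried out as follows:

\[ \left[\begin{array}{cc|cc} 9 & 15 & 4 & 5 \\ 19 & 2 & 1 & 18 \end{array}\right] \quad \text{We formed the matrix } [C \mid P]. \]

\[ \left[\begin{array}{cc|cc} 1 & 45 & 12 & 15 \\ 19 & 2 & 1 & 18 \end{array}\right] \quad \text{We multiplied the first row by } 9^{-1} = 3. \]

\[ \left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 19 & 2 & 1 & 18 \end{array}\right] \quad \text{We replaced 45 by its residue modulo 26.} \]

\[ \left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & -359 & -227 & -267 \end{array}\right] \quad \text{We added } -19 \text{ times the first row to the second.} \]

\[ \left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 5 & 7 & 19 \end{array}\right] \quad \text{We replaced the entries in the second row by their residues modulo 26.} \]

\[ \left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 1 & 147 & 399 \end{array}\right] \quad \text{We multiplied the second row by } 5^{-1} = 21. \]

\[ \left[\begin{array}{cc|cc} 1 & 19 & 12 & 15 \\ 0 & 1 & 17 & 9 \end{array}\right] \quad \text{We replaced the entries in the second row by their residues modulo 26.} \]

\[ \left[\begin{array}{cc|cc} 1 & 0 & -311 & -156 \\ 0 & 1 & 17 & 9 \end{array}\right] \quad \text{We added } -19 \text{ times the second row to the first.} \]

\[ \left[\begin{array}{cc|cc} 1 & 0 & 1 & 0 \\ 0 & 1 & 17 & 9 \end{array}\right] \quad \text{We replaced the entries in the first row by their residues modulo 26.} \]

Thus,

\[ (A^{-1})^T = \begin{bmatrix} 1 & 0 \\ 17 & 9 \end{bmatrix} \]

so the deciphering matrix is

\[ A^{-1} = \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \]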

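To decipher the message, we first group the ciphertext into pairs and find the numerical equivalent of each
letter:

IO     SB     TG     XE     SP      XH     OP      DE
9 15   19 2   20 7   24 5   19 16   24 8   15 16   4 5

Next, we multiply successive ciphertext vectors on the left by $A^{-1}$ and find the alphabet equivalents of the
resulting plaintext pairs:

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 9 \\ 15 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \end{bmatrix} \pmod{26} \quad \text{D E} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 19 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 18 \end{bmatrix} \pmod{26} \quad \text{A R} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 20 \\ 7 \end{bmatrix} = \begin{bmatrix} 9 \\ 11 \end{bmatrix} \pmod{26} \quad \text{I K} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 24 \\ 5 \end{bmatrix} = \begin{bmatrix} 5 \\ 19 \end{bmatrix} \pmod{26} \quad \text{E S} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 19 \\ 16 \end{bmatrix} = \begin{bmatrix} 5 \\ 14 \end{bmatrix} \pmod{26} \quad \text{E N} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 24 \\ 8 \end{bmatrix} = \begin{bmatrix} 4 \\ 20 \end{bmatrix} \pmod{26} \quad \text{D T} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 15 \\ 16 \end{bmatrix} = \begin{bmatrix} 1 \\ 14 \end{bmatrix} \pmod{26} \quad \text{A N} \]

\[ \begin{bmatrix} 1 & 17 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 11 \\ 19 \end{bmatrix} \pmod{26} \quad \text{K S} \]

Finally, we construct the message from the plaintext pairs:

DE AR IK ES EN DT AN KS
DEAR IKE SEND TANKS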
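The row-reduction procedure of Theorem 10.15.5 can be sketched in Python as shown below. This is a sketch under stated assumptions, not a definitive implementation: the names `reciprocal`, `recover_deciphering_matrix`, and `decipher` are hypothetical, and the pivot search only accepts a pivot entry that itself has a reciprocal modulo m. Because 26 is not prime, that simple search can fail for some invertible C, in which case a different set of known plaintext pairs would be needed.

```python
def reciprocal(a, m=26):
    """Reciprocal of a modulo m by exhaustive search, or None if it does not exist."""
    return next((x for x in range(m) if (a * x) % m == 1), None)


def recover_deciphering_matrix(P, C, m=26):
    """Row-reduce [C | P] modulo m and return A^{-1}, the transpose of the right half."""
    n = len(C)
    M = [[v % m for v in C[i]] + [v % m for v in P[i]] for i in range(n)]
    for col in range(n):
        # Look for a row whose entry in this column has a reciprocal modulo m.
        pivot = next((r for r in range(col, n)
                      if reciprocal(M[r][col], m) is not None), None)
        if pivot is None:
            raise ValueError("no usable pivot; try other plaintext/ciphertext pairs")
        M[col], M[pivot] = M[pivot], M[col]
        inv = reciprocal(M[col][col], m)
        M[col] = [(inv * v) % m for v in M[col]]       # make the pivot equal to 1
        for r in range(n):
            if r != col:
                f = M[r][col]                           # clear the rest of the column
                M[r] = [(v - f * w) % m for v, w in zip(M[r], M[col])]
    right = [row[n:] for row in M]                      # this is (A^{-1})^T
    return [[right[j][i] for j in range(n)] for i in range(n)]


def decipher(ciphertext, A_inv, m=26):
    """Multiply successive ciphertext blocks by A_inv and convert back to letters."""
    nums = [(ord(ch) - 64) % m for ch in ciphertext]    # A = 1, ..., Y = 25, Z = 0
    n = len(A_inv)
    out = []
    for i in range(0, len(nums), n):
        block = nums[i:i + n]
        for row in A_inv:
            k = sum(a * c for a, c in zip(row, block)) % m
            out.append(chr((k - 1) % m + 65))
    return "".join(out)


if __name__ == "__main__":
    A_inv = recover_deciphering_matrix([[4, 5], [1, 18]], [[9, 15], [19, 2]])
    print(A_inv, decipher("IOSBTGXESPXHOPDE", A_inv))
```

With the plaintext and ciphertext rows of Example 8 as input, the sketch follows the same pivots as the hand computation and recovers the same deciphering matrix.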


Further Readings 


Readers interested in learning more about mathematical cryptography are referred to the following books, the first 
of which is elementary and the second more advanced. 


1. Abraham Sinkov, Elementary Cryptanalysis, a Mathematical Approach (Mathematical Association of America, 2009). 


2. Alan G. Konheim, Cryptography, a Primer (New York: Wiley-Interscience, 1981). 


Exercise Set 10.15 


1. Obtain the Hill cipher of the message
DARK NIGHT


for each of the following enciphering matrices:


(a) $\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}$

(b) $\begin{bmatrix} 4 & 3 \\ 1 & 2 \end{bmatrix}$


Answer:

(a) GIYUOKEVBH
(b) SFANEFZWJH


2. In each part determine whether the matrix is invertible modulo 26. If so, find its inverse modulo 26 and check your work
by verifying that $AA^{-1} = A^{-1}A = I$ (mod 26).


"e: 
(c) а= | 
9-1 
EH 
EE 
Answer 
ЭМ 


(b) Not invertible 


(с) 4,31. 1 18 
ü =|; M 


(d) Not invertible 
(e) Not invertible 


(ser. T1532 
x = 3 | 


3. Decode the message
SAKNOXAOJX
given that it is a Hill cipher with enciphering matrix

\[ \begin{bmatrix} 4 & 1 \\ 3 & 2 \end{bmatrix} \]


Answer:


WE LOVE MATH


4. A Hill 2-cipher is intercepted that starts with the pairs
SL HK
Find the deciphering and enciphering matrices, given that the plaintext is known to start with the word ARMY.


Answer:


Deciphering matrix $= \begin{bmatrix} 7 & 15 \\ 6 & 5 \end{bmatrix}$; enciphering matrix $= \begin{bmatrix} 7 & 5 \\ 2 & 15 \end{bmatrix}$


5. Decode the following Hill 2-cipher if the last four plaintext letters are known to be ATOM. 


LNGIHGYBVRENJYQO 


Answer: 
THEY SPLIT THE ATOM 
6. Decode the following Hill 3-cipher if the first nine plaintext letters are IHAVECOME: 
HPAFQGGDUGDDHPGODYFNOR 
Answer: 


I HAVE COME TO BURY CAESAR 


7. All of the results of this section can be generalized to the case where the plaintext is a binary message; that is, it is a
sequence of 0's and 1's. In this case we do all of our modular arithmetic using modulus 2 rather than modulus 26. Thus,
for example, 1 + 1 = 0 (mod 2). Suppose we want to encrypt the message 110101111. Let us first break it into triplets to
form the three vectors $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, and let us take $\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$ as our enciphering matrix.


(a) Find the encoded message.


(b) Find the inverse modulo 2 of the enciphering matrix, and verify that it decodes your encoded message.

Answer:


(a) 010110001


(b) $\begin{bmatrix} 0 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}$


8. If, in addition to the standard alphabet, a period, comma, and question mark were allowed, then 29 plaintext and
ciphertext symbols would be available and all matrix arithmetic would be done modulo 29. Under what conditions
would a matrix with entries in $Z_{29}$ be invertible modulo 29?


Answer:


A is invertible modulo 29 if and only if $\det(A) \not\equiv 0 \pmod{29}$.

9. Show that the modular equation $4x \equiv 1 \pmod{26}$ has no solution in $Z_{26}$ by successively substituting the values
x = 0, 1, 2, ..., 25.

10. (a) Let P and C be the matrices in Theorem 10.15.5. Show that $P = C(A^{-1})^T$.


(b) To prove Theorem 10.15.5, let $E_1, E_2, \ldots, E_n$ be the elementary matrices that correspond to the row operations that
reduce C to I, so

\[ E_n \cdots E_2 E_1 C = I \]

Show that

\[ E_n \cdots E_2 E_1 P = (A^{-1})^T \]

from which it follows that the same sequence of row operations that reduces C to I converts P to $(A^{-1})^T$.

11. (a) If A is the enciphering matrix of a Hill n-cipher, show that

\[ A^{-1} = (C^{-1}P)^T \pmod{26} \]

where C and P are the matrices defined in Theorem 10.15.5.


(b) Instead of using Theorem 10.15.5 as in the text, find the deciphering matrix $A^{-1}$ of Example 8 by using the result in
part (a) and Equation (2) to compute $C^{-1}$. [Note: Although this method is practical for Hill 2-ciphers, Theorem
10.15.5 is more efficient for Hill n-ciphers with n > 2.]


Section 10.15 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, Mathematica, 
Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a scientific calculator with 
some linear algebra capabilities. For each exercise you will need to read the relevant documentation for the particular 
utility you are using. The goal of these exercises is to provide you with a basic proficiency with your technology utility. 
Once you have mastered the techniques in these exercises, you will be able to use your technology utility to solve many of 
the problems in the regular exercise sets. 


T1. Two integers that have no common factors (except 1) are said to be relatively prime. Given a positive integer n, let
$S_n = \{a_1, a_2, a_3, \ldots, a_m\}$, where $a_1 < a_2 < a_3 < \cdots < a_m$, be the set of all positive integers less than n and relatively
prime to n. For example, if n = 9, then

\[ S_9 = \{a_1, a_2, a_3, \ldots, a_6\} = \{1, 2, 4, 5, 7, 8\} \]

(a) Construct a table consisting of n and $S_n$ for n = 2, 3, ..., 15, and then compute

\[ \sum_{k=1}^{m} a_k \quad \text{and} \quad \left(\sum_{k=1}^{m} a_k\right) \pmod{n} \]

in each case. Draw a conjecture for n > 15 and prove your conjecture to be true. [Hint: Use the fact that if a is
relatively prime to n, then n - a is also relatively prime to n.]


(b) Given a positive integer n and the set $S_n$, let $P_n$ be the m × m matrix

\[ P_n = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots & a_{m-1} & a_m \\ a_2 & a_3 & a_4 & \cdots & a_m & a_1 \\ a_3 & a_4 & a_5 & \cdots & a_1 & a_2 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ a_{m-1} & a_m & a_1 & \cdots & a_{m-3} & a_{m-2} \\ a_m & a_1 & a_2 & \cdots & a_{m-2} & a_{m-1} \end{bmatrix} \]

so that, for example,

\[ P_9 = \begin{bmatrix} 1 & 2 & 4 & 5 & 7 & 8 \\ 2 & 4 & 5 & 7 & 8 & 1 \\ 4 & 5 & 7 & 8 & 1 & 2 \\ 5 & 7 & 8 & 1 & 2 & 4 \\ 7 & 8 & 1 & 2 & 4 & 5 \\ 8 & 1 & 2 & 4 & 5 & 7 \end{bmatrix} \]

Use a computer to compute $\det(P_n)$ and $\det(P_n) \pmod{n}$ for n = 2, 3, ..., 15, and then use these results to construct a
conjecture.


(c) Use the results of part (a) to prove your conjecture to be true. [Hint: Add the first m - 1 rows of $P_n$ to its last row and
then use Theorem 2.2.3.] What do these results imply about the inverse of $P_n \pmod{n}$?


T2. Given a positive integer n greater than 1, the number of positive integers less than n and relatively prime to n is called
the Euler phi function of n and is denoted by $\varphi(n)$. For example, $\varphi(6) = 2$ since only two positive integers (1 and 5) are
less than 6 and have no common factor with 6.

(a) Using a computer, for each value of n = 2, 3, ..., 25 compute and print out all positive integers that are less than n and
relatively prime to n. Then use these integers to determine the values of $\varphi(n)$ for n = 2, 3, ..., 25. Can you discover a
pattern in the results?


(b) It can be shown that if $\{p_1, p_2, p_3, \ldots, p_m\}$ are all the distinct prime factors of n, then

\[ \varphi(n) = n\left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right)\left(1 - \frac{1}{p_3}\right) \cdots \left(1 - \frac{1}{p_m}\right) \]

For example, since {2, 3} are the distinct prime factors of 12, we have

\[ \varphi(12) = 12\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = 4 \]

which agrees with the fact that {1, 5, 7, 11} are the only positive integers less than 12 and relatively prime to 12.
Using a computer, print out all the prime factors of n for n = 2, 3, ..., 25. Then compute $\varphi(n)$ using the formula above
and compare it to your results in part (a).
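For readers who use Python as their technology utility, the two ways of computing the Euler phi function described in T2 might be sketched as follows. The function names are assumptions for this illustration: `phi_by_counting` counts the integers relatively prime to n directly, while `phi_by_formula` applies the product formula of part (b), using trial division to find the distinct prime factors.

```python
from math import gcd


def phi_by_counting(n):
    """Count the positive integers less than n that are relatively prime to n."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)


def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def phi_by_formula(n):
    """Evaluate n(1 - 1/p_1)...(1 - 1/p_m) using exact integer arithmetic."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)  # multiply by (1 - 1/p) without fractions
    return result


if __name__ == "__main__":
    for n in range(2, 26):
        print(n, prime_factors(n), phi_by_counting(n), phi_by_formula(n))
```

The integer form `result // p * (p - 1)` is used instead of floating-point factors so that the comparison with the counted values is exact.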




10.16 Genetics 


In this section we investigate the propagation of an inherited trait in successive generations by computing 
powers of a matrix. 


Prerequisites 


Eigenvalues and Eigenvectors 
Diagonalization of a Matrix 


Intuitive Understanding of Limits 


Inheritance Traits 


In this section we examine the inheritance of traits in animals or plants. The inherited trait under consideration
is assumed to be governed by a set of two genes, which we designate by A and a. Under autosomal
inheritance each individual in the population of either gender possesses two of these genes, the possible
pairings being designated AA, Aa, and aa. This pair of genes is called the individual's genotype, and it
determines how the trait controlled by the genes is manifested in the individual. For example, in snapdragons
a set of two genes determines the color of the flower. Genotype AA produces red flowers, genotype Aa
produces pink flowers, and genotype aa produces white flowers. In humans, eye coloration is controlled
through autosomal inheritance. Genotypes AA and Aa have brown eyes, and genotype aa has blue eyes. In this
case we say that gene A dominates gene a, or that gene a is recessive to gene A, because genotype Aa has the
same outward trait as genotype AA.


In addition to autosomal inheritance we will also discuss X-linked inheritance. In this type of inheritance, the
male of the species possesses only one of the two possible genes (A or a), and the female possesses a pair of
the two genes (AA, Aa, or aa). In humans, color blindness, hereditary baldness, hemophilia, and muscular
dystrophy, to name a few, are traits controlled by X-linked inheritance.


Below we explain the manner in which the genes of the parents are passed on to their offspring for the two 
types of inheritance. We construct matrix models that give the probable genotypes of the offspring in terms of 
the genotypes of the parents, and we use these matrix models to follow the genotype distribution of a 
population through successive generations. 


Autosomal Inheritance 


In autosomal inheritance an individual inherits one gene from each of its parents' pairs of genes to form its
own particular pair. As far as we know, it is a matter of chance which of the two genes a parent passes on to
the offspring. Thus, if one parent is of genotype Aa, it is equally likely that the offspring will inherit the A
gene or the a gene from that parent. If one parent is of genotype aa and the other parent is of genotype Aa, the
offspring will always receive an a gene from the aa parent and will receive either an A gene or an a gene, with
equal probability, from the Aa parent. Consequently, each of the offspring has equal probability of being
genotype aa or Aa. In Table 1 we list the probabilities of the possible genotypes of the offspring for all
possible combinations of the genotypes of the parents.
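The rule just described (each parent passes on one of its two genes, each with probability 1/2) can be enumerated mechanically. The following is a minimal sketch, not part of the original text; the function name `offspring_distribution` and the string encoding of genotypes are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def offspring_distribution(parent1, parent2):
    """Return {genotype: probability} for the offspring of two parents.

    Each parent is a two-letter string ("AA", "Aa", or "aa"); each of its
    two genes is passed on with probability 1/2 (assumed encoding).
    """
    dist = {"AA": Fraction(0), "Aa": Fraction(0), "aa": Fraction(0)}
    for g1, g2 in product(parent1, parent2):
        # Sort the pair so that "aA" and "Aa" count as the same genotype.
        genotype = "".join(sorted(g1 + g2))
        dist[genotype] += Fraction(1, 4)
    return dist

# Print one column of probabilities per combination of parent genotypes.
for p1, p2 in [("AA", "AA"), ("AA", "Aa"), ("AA", "aa"),
               ("Aa", "Aa"), ("Aa", "aa"), ("aa", "aa")]:
    print(p1, p2, offspring_distribution(p1, p2))
```

Exact fractions are used so that the printed probabilities can be compared entry by entry with a table of genotype probabilities.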


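Table 1

                         Genotypes of Parents
Genotype of
Offspring      AA-AA   AA-Aa   AA-aa   Aa-Aa   Aa-aa   aa-aa

AA               1      1/2      0      1/4      0       0
Aa               0      1/2      1      1/2     1/2      0
aa               0       0       0      1/4     1/2      1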


EXAMPLE 1 Distribution of Genotypes in a Population


Suppose that a farmer has a large population of plants consisting of some distribution of all
three possible genotypes AA, Aa, and aa. The farmer desires to undertake a breeding program in
which each plant in the population is always fertilized with a plant of genotype AA and is then
replaced by one of its offspring. We want to derive an expression for the distribution of the
three possible genotypes in the population after any number of generations.


For n = 0, 1, 2, ..., let us set

$a_n$ = fraction of plants of genotype AA in nth generation
$b_n$ = fraction of plants of genotype Aa in nth generation
$c_n$ = fraction of plants of genotype aa in nth generation

Thus $a_0$, $b_0$, and $c_0$ specify the initial distribution of the genotypes. We also have that

\[ a_n + b_n + c_n = 1 \quad \text{for } n = 0, 1, 2, \ldots \]


From Table 1 we can determine the genotype distribution of each generation from the genotype
distribution of the preceding generation by the following equations:

\[
\begin{aligned}
a_n &= a_{n-1} + \tfrac{1}{2} b_{n-1} \\
b_n &= c_{n-1} + \tfrac{1}{2} b_{n-1} \qquad n = 1, 2, \ldots \\
c_n &= 0
\end{aligned}
\tag{1}
\]


For example, the first of these three equations states that all the offspring of a plant of genotype
AA will be of genotype AA under this breeding program and that half of the offspring of a plant
of genotype Aa will be of genotype AA.


Equations 1 can be written in matrix notation as

\[ \mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}, \quad n = 1, 2, \ldots \tag{2} \]

where

\[
\mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \end{bmatrix}, \quad
\mathbf{x}^{(n-1)} = \begin{bmatrix} a_{n-1} \\ b_{n-1} \\ c_{n-1} \end{bmatrix}, \quad \text{and} \quad
M = \begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]


Note that the three columns of the matrix M are the same as the first three columns of Table 1.
From Equation 2 it follows that

\[ \mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)} = M^2\mathbf{x}^{(n-2)} = \cdots = M^n\mathbf{x}^{(0)} \tag{3} \]


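Consequently, if we can find an explicit expression for $M^n$, we can use 3 to obtain an explicit
expression for $\mathbf{x}^{(n)}$. To find an explicit expression for $M^n$, we first diagonalize M. That is, we
find an invertible matrix P and a diagonal matrix D such that

\[ M = PDP^{-1} \tag{4} \]


With such a diagonalization, we then have (see Exercise 1)

\[ M^n = PD^nP^{-1} \quad \text{for } n = 1, 2, \ldots \]

where

\[
D^n = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_k \end{bmatrix}^n
= \begin{bmatrix} \lambda_1^n & 0 & \cdots & 0 \\ 0 & \lambda_2^n & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_k^n \end{bmatrix}
\]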


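The diagonalization of M is accomplished by finding its eigenvalues and corresponding
eigenvectors. These are as follows (verify):

\[ \text{Eigenvalues:} \quad \lambda_1 = 1, \quad \lambda_2 = \tfrac{1}{2}, \quad \lambda_3 = 0 \]

\[
\text{Corresponding eigenvectors:} \quad
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]

Thus, in Equation 4 we have

\[
D = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]

and

\[
P = [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \mathbf{v}_3] = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\]

Therefore,

\[
\mathbf{x}^{(n)} = PD^nP^{-1}\mathbf{x}^{(0)} =
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \left(\tfrac{1}{2}\right)^n & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
\]

or

\[
\mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \end{bmatrix}
= \begin{bmatrix} 1 & 1 - \left(\tfrac{1}{2}\right)^n & 1 - \left(\tfrac{1}{2}\right)^{n-1} \\ 0 & \left(\tfrac{1}{2}\right)^n & \left(\tfrac{1}{2}\right)^{n-1} \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
= \begin{bmatrix} a_0 + b_0 + c_0 - \left(\tfrac{1}{2}\right)^n b_0 - \left(\tfrac{1}{2}\right)^{n-1} c_0 \\ \left(\tfrac{1}{2}\right)^n b_0 + \left(\tfrac{1}{2}\right)^{n-1} c_0 \\ 0 \end{bmatrix}
\]

Using the fact that $a_0 + b_0 + c_0 = 1$, we thus have

\[
\begin{aligned}
a_n &= 1 - \left(\tfrac{1}{2}\right)^n b_0 - \left(\tfrac{1}{2}\right)^{n-1} c_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0 + \left(\tfrac{1}{2}\right)^{n-1} c_0 \qquad n = 1, 2, \ldots \\
c_n &= 0
\end{aligned}
\tag{5}
\]


These are explicit formulas for the fractions of the three genotypes in the nth generation of
plants in terms of the initial genotype fractions.


Because $\left(\tfrac{1}{2}\right)^n$ tends to zero as n approaches infinity, it follows from these equations that

\[ a_n \to 1, \qquad b_n \to 0, \qquad c_n = 0 \]

as n approaches infinity. That is, in the limit all plants in the population will be genotype AA.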
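As a check on Example 1, the recurrence of Equation 2 can be iterated numerically and compared with the closed form of Equation 5. This is a small sketch in exact rational arithmetic; the initial distribution `x0` is a hypothetical choice, and the helper names `step` and `closed_form` are not from the text.

```python
from fractions import Fraction as F

# Transition matrix M for fertilization with genotype AA (Example 1).
M = [[F(1), F(1, 2), F(0)],
     [F(0), F(1, 2), F(1)],
     [F(0), F(0),    F(0)]]

def step(matrix, vector):
    """One generation: return the product matrix * vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def closed_form(n, a0, b0, c0):
    """Equation 5, valid for n >= 1 when a0 + b0 + c0 = 1."""
    half_n = F(1, 2) ** n
    half_n1 = F(1, 2) ** (n - 1)
    return [1 - half_n * b0 - half_n1 * c0,
            half_n * b0 + half_n1 * c0,
            F(0)]

x0 = [F(1, 5), F(3, 10), F(1, 2)]  # hypothetical initial distribution
x = x0
for n in range(1, 11):
    x = step(M, x)
    assert x == closed_form(n, *x0)  # iteration agrees with Equation 5

print([float(v) for v in x])  # after 10 generations, almost all plants are AA
```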


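EXAMPLE 2 Modifying Example 1


We can modify Example 1 so that instead of each plant being fertilized with one of genotype
AA, each plant is fertilized with a plant of its own genotype. Using the same notation as in
Example 1, we then find

\[ \mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)} \]

where

\[
M = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{4} & 1 \end{bmatrix}
\]


The columns of this new matrix M are the same as the columns of Table 1 corresponding to
parents with genotypes AA-AA, Aa-Aa, and aa-aa.


The eigenvalues of M are (verify)

\[ \lambda_1 = 1, \quad \lambda_2 = 1, \quad \lambda_3 = \tfrac{1}{2} \]


The eigenvalue $\lambda_1 = 1$ has multiplicity two and its corresponding eigenspace is
two-dimensional. Picking two linearly independent eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in that eigenspace,
and a single eigenvector $\mathbf{v}_3$ for the simple eigenvalue $\lambda_3 = \tfrac{1}{2}$, we have (verify)

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]


The calculations for $\mathbf{x}^{(n)}$ are then

\[
\begin{aligned}
\mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)} &= PD^nP^{-1}\mathbf{x}^{(0)} \\
&= \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & -2 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \left(\tfrac{1}{2}\right)^n \end{bmatrix}
\begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} & 1 \\ 0 & -\tfrac{1}{2} & 0 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix} \\
&= \begin{bmatrix} 1 & \tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1} & 0 \\ 0 & \left(\tfrac{1}{2}\right)^n & 0 \\ 0 & \tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1} & 1 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \\ c_0 \end{bmatrix}
\end{aligned}
\]


Thus,

\[
\begin{aligned}
a_n &= a_0 + \left[\tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1}\right] b_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0 \qquad\qquad n = 1, 2, \ldots \\
c_n &= c_0 + \left[\tfrac{1}{2} - \left(\tfrac{1}{2}\right)^{n+1}\right] b_0
\end{aligned}
\tag{6}
\]

In the limit, as n tends to infinity, $\left(\tfrac{1}{2}\right)^n \to 0$ and $\left(\tfrac{1}{2}\right)^{n+1} \to 0$, so

\[ a_n \to a_0 + \tfrac{1}{2} b_0, \qquad b_n \to 0, \qquad c_n \to c_0 + \tfrac{1}{2} b_0 \]


Thus, fertilization of each plant with one of its own genotype produces a population that in the
limit contains only genotypes AA and aa.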


Autosomal Recessive Diseases 


There are many genetic diseases governed by autosomal inheritance in which a normal gene A dominates an
abnormal gene a. Genotype AA is a normal individual; genotype Aa is a carrier of the disease but is not
afflicted with the disease; and genotype aa is afflicted with the disease. In humans such genetic diseases are
often associated with a particular racial group, for instance, cystic fibrosis (predominant among Caucasians),
sickle-cell anemia (predominant among people of African origin), Cooley's anemia (predominant among
people of Mediterranean origin), and Tay-Sachs disease (predominant among Eastern European Jews).


Suppose that an animal breeder has a population of animals that carries an autosomal recessive disease.
Suppose further that those animals afflicted with the disease do not survive to maturity. One possible way to
control such a disease is for the breeder to always mate a female, regardless of her genotype, with a normal
male. In this way, all future offspring will either have a normal father and a normal mother (AA-AA matings)
or a normal father and a carrier mother (AA-Aa matings). There can be no AA-aa matings since animals of
genotype aa do not survive to maturity. Under this type of mating program no future offspring will be
afflicted with the disease, although there will still be carriers in future generations. Let us now determine the
fraction of carriers in future generations. We set

\[ \mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \end{bmatrix}, \quad n = 1, 2, \ldots \]

where

$a_n$ = fraction of population of genotype AA in nth generation
$b_n$ = fraction of population of genotype Aa (carriers) in nth generation


Because each offspring has at least one normal parent, we may consider the controlled mating program as one
of continual mating with genotype AA, as in Example 1. Thus, the transition of genotype distributions from
one generation to the next is governed by the equation

\[ \mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}, \quad n = 1, 2, \ldots \]

where

\[ M = \begin{bmatrix} 1 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} \end{bmatrix} \]


Because we know the initial distribution $\mathbf{x}^{(0)}$, the distribution of genotypes in the nth generation is thus given
by

\[ \mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)}, \quad n = 1, 2, \ldots \]


The diagonalization of M is easily carried out (see Exercise 4) and leads to

\[
\mathbf{x}^{(n)} = PD^nP^{-1}\mathbf{x}^{(0)} =
\begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & \left(\tfrac{1}{2}\right)^n \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} a_0 \\ b_0 \end{bmatrix}
= \begin{bmatrix} a_0 + b_0 - \left(\tfrac{1}{2}\right)^n b_0 \\ \left(\tfrac{1}{2}\right)^n b_0 \end{bmatrix}
\]


Because $a_0 + b_0 = 1$, we have

\[
\begin{aligned}
a_n &= 1 - \left(\tfrac{1}{2}\right)^n b_0 \\
b_n &= \left(\tfrac{1}{2}\right)^n b_0
\end{aligned}
\qquad n = 1, 2, \ldots
\tag{7}
\]

Thus, as n tends to infinity, we have

\[ a_n \to 1, \qquad b_n \to 0 \]

so in the limit there will be no carriers in the population.
From 7 we see that

\[ b_n = \tfrac{1}{2} b_{n-1}, \quad n = 1, 2, \ldots \tag{8} \]


That is, the fraction of carriers in each generation is one-half the fraction of carriers in the preceding
generation. It would be of interest also to investigate the propagation of carriers under random mating, when
two animals mate without regard to their genotypes. Unfortunately, such random mating leads to nonlinear
equations, and the techniques of this section are not applicable. However, by other techniques it can be shown
that under random mating, Equation 8 is replaced by

\[ b_n = \frac{b_{n-1}}{1 + \tfrac{1}{2} b_{n-1}}, \quad n = 1, 2, \ldots \tag{9} \]


As a numerical example, suppose that the breeder starts with a population in which 10% of the animals are
carriers. Under the controlled-mating program governed by Equation 8, the percentage of carriers can be
reduced to 5% in one generation. But under random mating, Equation 9 predicts that 9.5% of the population
will be carriers after one generation ($b_n = .095$ if $b_{n-1} = .10$). In addition, under controlled mating no
offspring will ever be afflicted with the disease, but with random mating it can be shown that about 1 in 400
offspring will be born with the disease when 10% of the population are carriers.
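The numerical comparison above is easy to reproduce. The sketch below takes the controlled-mating rule in the form b_n = b_{n-1}/2 and the random-mating rule in the form b_n = b_{n-1}/(1 + b_{n-1}/2), which is the form that reproduces the 9.5% figure quoted above; the function names are illustrative assumptions, not part of the text.

```python
from fractions import Fraction as F

def controlled(b):
    """Equation 8: the carrier fraction halves each generation."""
    return b / 2

def random_mating(b):
    """Equation 9 (assumed form): b_n = b_{n-1} / (1 + b_{n-1}/2)."""
    return b / (1 + b / 2)

def carrier_history(update, start, generations):
    """Return the carrier fractions for generations 0..generations."""
    history = [start]
    for _ in range(generations):
        history.append(update(history[-1]))
    return history

start = F(1, 10)  # 10% carriers initially, as in the numerical example
for name, rule in [("controlled", controlled), ("random", random_mating)]:
    values = carrier_history(rule, start, 5)
    print(name, [round(float(v), 4) for v in values])

# Fraction of aa offspring when two carriers mate at random:
# (1/10) * (1/10) * (1/4) = 1/400.
print(start * start * F(1, 4))
```

Under controlled mating the carrier fraction decays geometrically, whereas under random mating it falls only slowly, which is the contrast the text draws.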


X-Linked Inheritance 


As mentioned in the introduction, in X-linked inheritance the male possesses one gene (A or a) and the female
possesses two genes (AA, Aa, or aa). The term X-linked is used because such genes are found on the
X-chromosome, of which the male has one and the female has two. The inheritance of such genes is as
follows: A male offspring receives one of his mother's two genes with equal probability, and a female
offspring receives the one gene of her father and one of her mother's two genes with equal probability.
Readers familiar with basic probability can verify that this type of inheritance leads to the genotype
probabilities in Table 2.


Table 2

                         Genotypes of Parents (Father, Mother)

Offspring        (A, AA)  (A, Aa)  (A, aa)  (a, AA)  (a, Aa)  (a, aa)

Male     A          1       1/2       0        1       1/2       0
         a          0       1/2       1        0       1/2       1

Female   AA         1       1/2       0        0        0        0
         Aa         0       1/2       1        1       1/2       0
         aa         0        0        0        0       1/2       1


We will discuss a program of inbreeding in connection with X-linked inheritance. We begin with a male and 
female; select two of their offspring at random, one of each gender, and mate them; select two of the resulting 
offspring and mate them; and so forth. Such inbreeding is commonly performed with animals. (Among 
humans, such brother-sister marriages were used by the rulers of ancient Egypt to keep the royal line pure.) 


The original male-female pair can be one of the six types, corresponding to the six columns of Table 2:

(A, AA), (A, Aa), (A, aa), (a, AA), (a, Aa), (a, aa)

The sibling pairs mated in each successive generation have certain probabilities of being one of these six
types. To compute these probabilities, for n = 0, 1, 2, ..., let us set

$a_n$ = probability sibling-pair mated in nth generation is type (A, AA)
$b_n$ = probability sibling-pair mated in nth generation is type (A, Aa)
$c_n$ = probability sibling-pair mated in nth generation is type (A, aa)
$d_n$ = probability sibling-pair mated in nth generation is type (a, AA)
$e_n$ = probability sibling-pair mated in nth generation is type (a, Aa)
$f_n$ = probability sibling-pair mated in nth generation is type (a, aa)

With these probabilities we form a column vector

\[ \mathbf{x}^{(n)} = \begin{bmatrix} a_n \\ b_n \\ c_n \\ d_n \\ e_n \\ f_n \end{bmatrix}, \quad n = 0, 1, 2, \ldots \]

From Table 2 it follows that

\[ \mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}, \quad n = 1, 2, \ldots \tag{10} \]


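where

\[
M = \begin{bmatrix}
1 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 0 & 1 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 0 \\
0 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 1 & 0 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 1
\end{bmatrix}
\]

The columns of M, from left to right, and the rows of M, from top to bottom, correspond to the sibling-pair
types (A, AA), (A, Aa), (A, aa), (a, AA), (a, Aa), (a, aa).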


For example, suppose that in the (n - 1)-st generation, the sibling pair mated is type (A, Aa). Then their
male offspring will be genotype A or a with equal probability, and their female offspring will be genotype AA
or Aa with equal probability. Because one of the male offspring and one of the female offspring are chosen at
random for mating, the next sibling pair will be one of type (A, AA), (A, Aa), (a, AA), or (a, Aa) with
equal probability. Thus, the second column of M contains $\tfrac{1}{4}$ in each of the four rows corresponding to these
four sibling pairs. (See Exercise 9 for the remaining columns.)


As in our previous examples, it follows from 10 that

\[ \mathbf{x}^{(n)} = M^n\mathbf{x}^{(0)}, \quad n = 1, 2, \ldots \tag{11} \]
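The transition matrix of Equation 10 can be generated from the X-linked inheritance rules rather than copied from a table. The sketch below does this under the stated rules (a son receives one of the mother's two genes, a daughter receives the father's gene plus one of the mother's two genes); the names `PAIRS` and `transition_column` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Sibling-pair types (father gene, mother genotype), in the order used in the text.
PAIRS = [("A", "AA"), ("A", "Aa"), ("A", "aa"),
         ("a", "AA"), ("a", "Aa"), ("a", "aa")]

def transition_column(father, mother):
    """Probability of each next sibling-pair type, given the parents (father, mother)."""
    col = {pair: F(0) for pair in PAIRS}
    # The son's gene and the daughter's maternal gene are chosen independently,
    # each being one of the mother's two genes with probability 1/2.
    for son_gene, daughter_gene in product(mother, mother):
        daughter = "".join(sorted(father + daughter_gene))  # "aA" -> "Aa"
        col[(son_gene, daughter)] += F(1, 4)
    return col

# M[row][col]: probability that parents of type PAIRS[col] yield a pair of type PAIRS[row].
M = [[transition_column(*parents)[child] for parents in PAIRS] for child in PAIRS]

for row in M:
    print([str(entry) for entry in row])
```

Each column of the resulting matrix sums to 1, as it must for a matrix of transition probabilities.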


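After lengthy calculations, the eigenvalues and eigenvectors of M turn out to be

\[
\lambda_1 = 1, \quad \lambda_2 = 1, \quad \lambda_3 = \tfrac{1}{2}, \quad \lambda_4 = -\tfrac{1}{2}, \quad
\lambda_5 = \tfrac{1}{4}\left(1 + \sqrt{5}\right), \quad \lambda_6 = \tfrac{1}{4}\left(1 - \sqrt{5}\right)
\]

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_3 = \begin{bmatrix} -1 \\ 2 \\ -1 \\ 1 \\ -2 \\ 1 \end{bmatrix}, \quad
\mathbf{v}_4 = \begin{bmatrix} 1 \\ -6 \\ -3 \\ 3 \\ 6 \\ -1 \end{bmatrix}
\]

\[
\mathbf{v}_5 = \begin{bmatrix} \tfrac{1}{4}(-3 - \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-1 + \sqrt{5}) \\ \tfrac{1}{4}(-1 + \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-3 - \sqrt{5}) \end{bmatrix}, \quad
\mathbf{v}_6 = \begin{bmatrix} \tfrac{1}{4}(-3 + \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-1 - \sqrt{5}) \\ \tfrac{1}{4}(-1 - \sqrt{5}) \\ 1 \\ \tfrac{1}{4}(-3 + \sqrt{5}) \end{bmatrix}
\]

The diagonalization of M then leads to

\[ \mathbf{x}^{(n)} = PD^nP^{-1}\mathbf{x}^{(0)}, \quad n = 1, 2, \ldots \tag{12} \]

where

\[
P = \begin{bmatrix}
1 & 0 & -1 & 1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5}) \\
0 & 0 & 2 & -6 & 1 & 1 \\
0 & 0 & -1 & -3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & 1 & 3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & -2 & 6 & 1 & 1 \\
0 & 1 & 1 & -1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5})
\end{bmatrix}
\]

\[
D^n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & \left(\tfrac{1}{2}\right)^n & 0 & 0 & 0 \\
0 & 0 & 0 & \left(-\tfrac{1}{2}\right)^n & 0 & 0 \\
0 & 0 & 0 & 0 & \left[\tfrac{1}{4}(1 + \sqrt{5})\right]^n & 0 \\
0 & 0 & 0 & 0 & 0 & \left[\tfrac{1}{4}(1 - \sqrt{5})\right]^n
\end{bmatrix}
\]

\[
P^{-1} = \begin{bmatrix}
1 & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & 0 \\
0 & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & 1 \\
0 & \tfrac{1}{8} & -\tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{1}{8} & 0 \\
0 & -\tfrac{1}{24} & -\tfrac{1}{12} & \tfrac{1}{12} & \tfrac{1}{24} & 0 \\
0 & \tfrac{1}{20}(5 + \sqrt{5}) & \tfrac{1}{5}\sqrt{5} & \tfrac{1}{5}\sqrt{5} & \tfrac{1}{20}(5 + \sqrt{5}) & 0 \\
0 & \tfrac{1}{20}(5 - \sqrt{5}) & -\tfrac{1}{5}\sqrt{5} & -\tfrac{1}{5}\sqrt{5} & \tfrac{1}{20}(5 - \sqrt{5}) & 0
\end{bmatrix}
\]


We will not write out the matrix product in 12, as it is rather unwieldy. However, if a specific vector $\mathbf{x}^{(0)}$ is
given, the calculation for $\mathbf{x}^{(n)}$ is not too cumbersome (see Exercise 6).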

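Because the absolute values of the last four diagonal entries of D are less than 1, we see that as n tends to
infinity,

\[
D^n \to \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]


And so, from Equation 12,

\[
\mathbf{x}^{(n)} \to P \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} P^{-1}\mathbf{x}^{(0)}
\]

Performing the matrix multiplication on the right, we obtain (verify)

\[
\mathbf{x}^{(n)} \to \begin{bmatrix}
a_0 + \tfrac{2}{3} b_0 + \tfrac{1}{3} c_0 + \tfrac{2}{3} d_0 + \tfrac{1}{3} e_0 \\
0 \\ 0 \\ 0 \\ 0 \\
f_0 + \tfrac{1}{3} b_0 + \tfrac{2}{3} c_0 + \tfrac{1}{3} d_0 + \tfrac{2}{3} e_0
\end{bmatrix}
\tag{13}
\]


That is, in the limit all sibling pairs will be either type (A, AA) or type (a, aa). For example, if the initial
parents are type (A, Aa) (that is, $b_0 = 1$ and $a_0 = c_0 = d_0 = e_0 = f_0 = 0$), then as n tends to infinity,

\[
\mathbf{x}^{(n)} \to \begin{bmatrix} \tfrac{2}{3} \\ 0 \\ 0 \\ 0 \\ 0 \\ \tfrac{1}{3} \end{bmatrix}
\]


Thus, in the limit there is probability $\tfrac{2}{3}$ that the sibling pairs will be (A, AA), and probability $\tfrac{1}{3}$ that they will
be (a, aa).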
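The limiting behavior can also be observed by simply iterating the inbreeding model, without diagonalizing. The following sketch assumes the 6 x 6 transition matrix M of the X-linked inbreeding model, with sibling-pair types ordered (A, AA), (A, Aa), (A, aa), (a, AA), (a, Aa), (a, aa), and starts from parents of type (A, Aa); the iteration count of 200 is an arbitrary choice.

```python
# Transition matrix for X-linked inbreeding (columns and rows ordered as above).
M = [
    [1, 0.25, 0, 0, 0,    0],
    [0, 0.25, 0, 1, 0.25, 0],
    [0, 0,    0, 0, 0.25, 0],
    [0, 0.25, 0, 0, 0,    0],
    [0, 0.25, 1, 0, 0.25, 0],
    [0, 0,    0, 0, 0.25, 1],
]

def step(matrix, vector):
    """Return matrix * vector, that is, the next generation's probabilities."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

x = [0, 1, 0, 0, 0, 0]  # initial parents are type (A, Aa) with certainty
for _ in range(200):
    x = step(M, x)

print([round(v, 6) for v in x])  # first entry near 2/3, last entry near 1/3
```

Convergence is geometric, governed by the largest eigenvalue of M with absolute value less than 1, so a few hundred iterations are more than enough here.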


Exercise Set 10.16 


1. Show that if $M = PDP^{-1}$, then $M^n = PD^nP^{-1}$ for n = 1, 2, ....


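2. In Example 1 suppose that the plants are always fertilized with a plant of genotype Aa rather than one of
genotype AA. Derive formulas for the fractions of the plants of genotypes AA, Aa, and aa in the nth
generation. Also, find the limiting genotype distribution as n tends to infinity.


Answer:

\[
\begin{aligned}
a_n &= \tfrac{1}{4} + \left(\tfrac{1}{2}\right)^{n+1}(a_0 - c_0) \\
b_n &= \tfrac{1}{2} \qquad\qquad n = 1, 2, \ldots \\
c_n &= \tfrac{1}{4} - \left(\tfrac{1}{2}\right)^{n+1}(a_0 - c_0)
\end{aligned}
\qquad\qquad
\begin{aligned}
a_n &\to \tfrac{1}{4} \\
b_n &= \tfrac{1}{2} \\
c_n &\to \tfrac{1}{4}
\end{aligned}
\quad \text{as } n \to \infty
\]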

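3. In Example 1 suppose that the initial plants are fertilized with genotype AA, the first generation is
fertilized with genotype Aa, the second generation is fertilized with genotype AA, and this alternating
pattern of fertilization is kept up. Find formulas for the fractions of the plants of genotypes AA, Aa, and aa
in the nth generation.


Answer:

\[
\begin{aligned}
a_{2n+1} &= \tfrac{2}{3} + \frac{1}{6 \cdot 4^n}(2a_0 - b_0 - 4c_0) \\
b_{2n+1} &= \tfrac{1}{3} - \frac{1}{6 \cdot 4^n}(2a_0 - b_0 - 4c_0) \qquad n = 0, 1, 2, \ldots \\
c_{2n+1} &= 0
\end{aligned}
\]

\[
\begin{aligned}
a_{2n} &= \tfrac{5}{12} + \frac{1}{6 \cdot 4^n}(2a_0 - b_0 - 4c_0) \\
b_{2n} &= \tfrac{1}{2} \qquad\qquad n = 1, 2, \ldots \\
c_{2n} &= \tfrac{1}{12} - \frac{1}{6 \cdot 4^n}(2a_0 - b_0 - 4c_0)
\end{aligned}
\]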

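4. In the section on autosomal recessive diseases, find the eigenvalues and eigenvectors of the matrix M and
verify Equation 7.


Answer:

\[
\text{Eigenvalues: } \lambda_1 = 1, \ \lambda_2 = \tfrac{1}{2}; \quad
\text{eigenvectors: } \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \ \mathbf{e}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\]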

5. Suppose that a breeder has an animal population in which 25% of the population are carriers of an
autosomal recessive disease. If the breeder allows the animals to mate irrespective of their genotype, use
Equation 9 to calculate the number of generations required for the percentage of carriers to fall from 25%
to 10%. If the breeder instead implements the controlled-mating program determined by Equation 8, what
will the percentage of carriers be after the same number of generations?


Answer:


12 generations; .006%


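6. In the section on X-linked inheritance, suppose that the initial parents are equally likely to be of any of the
six possible genotype parents; that is,

\[ \mathbf{x}^{(0)} = \begin{bmatrix} \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} & \tfrac{1}{6} \end{bmatrix}^T \]


Using Equation 12, calculate $\mathbf{x}^{(n)}$ and also calculate the limit of $\mathbf{x}^{(n)}$ as n tends to infinity.


Answer:

\[
\mathbf{x}^{(n)} = \begin{bmatrix}
\tfrac{1}{2} + \dfrac{1}{3 \cdot 4^{n+2}}\left[(-3 - \sqrt{5})(1 + \sqrt{5})^{n+1} + (-3 + \sqrt{5})(1 - \sqrt{5})^{n+1}\right] \\[1ex]
\dfrac{1}{3 \cdot 4^{n+1}}\left[(1 + \sqrt{5})^{n+1} + (1 - \sqrt{5})^{n+1}\right] \\[1ex]
\dfrac{1}{3 \cdot 4^{n+1}}\left[(1 + \sqrt{5})^{n} + (1 - \sqrt{5})^{n}\right] \\[1ex]
\dfrac{1}{3 \cdot 4^{n+1}}\left[(1 + \sqrt{5})^{n} + (1 - \sqrt{5})^{n}\right] \\[1ex]
\dfrac{1}{3 \cdot 4^{n+1}}\left[(1 + \sqrt{5})^{n+1} + (1 - \sqrt{5})^{n+1}\right] \\[1ex]
\tfrac{1}{2} + \dfrac{1}{3 \cdot 4^{n+2}}\left[(-3 - \sqrt{5})(1 + \sqrt{5})^{n+1} + (-3 + \sqrt{5})(1 - \sqrt{5})^{n+1}\right]
\end{bmatrix}
\]

\[
\mathbf{x}^{(n)} \to \begin{bmatrix} \tfrac{1}{2} \\ 0 \\ 0 \\ 0 \\ 0 \\ \tfrac{1}{2} \end{bmatrix} \quad \text{as } n \to \infty
\]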

7. From 13 show that under X-linked inheritance with inbreeding, the probability that the limiting sibling
pairs will be of type (A, AA) is the same as the proportion of A genes in the initial population.


8. In X-linked inheritance suppose that none of the females of genotype Aa survive to maturity. Under
inbreeding the possible sibling pairs are then


(A, AA), (A, aa), (a, AA), and (a, aa)


Find the transition matrix that describes how the genotype distribution changes in one generation.


Answer:

\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]


9. Derive the matrix M in Equation 10 from Table 2.


Section 10.16 Technology Exercises


The following exercises are designed to be solved using a technology utility. Typically, this will be
MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra
software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to
read the relevant documentation for the particular utility you are using. The goal of these exercises is to
provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in
these exercises, you will be able to use your technology utility to solve many of the problems in the regular
exercise sets.


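T1.


(a) Use a computer to verify that the eigenvalues and eigenvectors of

\[
M = \begin{bmatrix}
1 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 0 & 1 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 0 \\
0 & \tfrac{1}{4} & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{4} & 1 & 0 & \tfrac{1}{4} & 0 \\
0 & 0 & 0 & 0 & \tfrac{1}{4} & 1
\end{bmatrix}
\]

as given in the text are correct.

(b) Starting with $\mathbf{x}^{(n)} = M\mathbf{x}^{(n-1)}$ and the assumption that

\[ \lim_{n \to \infty} \mathbf{x}^{(n)} = \mathbf{x} \]

exists, we must have

\[ \lim_{n \to \infty} \mathbf{x}^{(n)} = M \lim_{n \to \infty} \mathbf{x}^{(n-1)} \quad \text{or} \quad \mathbf{x} = M\mathbf{x} \]

This suggests that x can be solved directly using the equation $(M - I)\mathbf{x} = \mathbf{0}$. Use a computer to solve the
equation $\mathbf{x} = M\mathbf{x}$, where

\[ \mathbf{x} = \begin{bmatrix} a & b & c & d & e & f \end{bmatrix}^T \]

and a + b + c + d + e + f = 1; compare your results to Equation 13. Explain why the solution to
$(M - I)\mathbf{x} = \mathbf{0}$ along with a + b + c + d + e + f = 1 is not specific enough to determine
$\lim_{n \to \infty} \mathbf{x}^{(n)}$.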


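T2.

(a) Given

\[
P = \begin{bmatrix}
1 & 0 & -1 & 1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5}) \\
0 & 0 & 2 & -6 & 1 & 1 \\
0 & 0 & -1 & -3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & 1 & 3 & \tfrac{1}{4}(-1 + \sqrt{5}) & \tfrac{1}{4}(-1 - \sqrt{5}) \\
0 & 0 & -2 & 6 & 1 & 1 \\
0 & 1 & 1 & -1 & \tfrac{1}{4}(-3 - \sqrt{5}) & \tfrac{1}{4}(-3 + \sqrt{5})
\end{bmatrix}
\]

from Equation 12 and

\[
\lim_{n \to \infty} D^n = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

use a computer to show that

\[
\lim_{n \to \infty} M^n = \begin{bmatrix}
1 & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{3} & \tfrac{2}{3} & \tfrac{1}{3} & \tfrac{2}{3} & 1
\end{bmatrix}
\]


(b) Use a computer to calculate $M^n$ for n = 10, 20, 30, 40, 50, 60, 70, and then compare your results to the
limit in part (a).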


10.17 Age-Specific Population Growth 


In this section we investigate, using the Leslie matrix model, the growth over time of a female population that 
is divided into age classes. We then determine the limiting age distribution and growth rate of the population. 


Prerequisites 


Eigenvalues and Eigenvectors 
Diagonalization of a Matrix 


Intuitive Understanding of Limits 


One of the most common models of population growth used by demographers is the so-called Leslie model
developed in the 1940s. This model describes the growth of the female portion of a human or animal
population. In this model the females are divided into age classes of equal duration. To be specific, suppose
that the maximum age attained by any female in the population is L years (or some other time unit) and we
divide the population into n age classes. Then each class is L/n years in duration. We label the age classes
according to Table 1.


Table 1


Age Class      Age Interval

1              [0, L/n)
2              [L/n, 2L/n)
3              [2L/n, 3L/n)
...            ...
n - 1          [(n - 2)L/n, (n - 1)L/n)
n              [(n - 1)L/n, L]


Suppose that we know the number of females in each of the n classes at time t = 0. In particular, let there be
$x_1^{(0)}$ females in the first class, $x_2^{(0)}$ females in the second class, and so forth. With these n numbers we form a
column vector:

\[ \mathbf{x}^{(0)} = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ \vdots \\ x_n^{(0)} \end{bmatrix} \]

We call this vector the initial age distribution vector.


As time progresses, the number of females within each of the n classes changes because of three biological 
processes: birth, death, and aging. By describing these three processes quantitatively, we will see how to 
project the initial age distribution vector into the future. 


The easiest way to study the aging process is to observe the population at discrete times, say,
$t_0, t_1, t_2, \ldots, t_k, \ldots$. The Leslie model requires that the duration between any two successive observation
times be the same as the duration of the age intervals. Therefore, we set

\[
\begin{aligned}
t_0 &= 0 \\
t_1 &= L/n \\
t_2 &= 2L/n \\
&\ \ \vdots \\
t_k &= kL/n \\
&\ \ \vdots
\end{aligned}
\]

With this assumption, all females in the (i + 1)-st class at time $t_{k+1}$ were in the ith class at time $t_k$.


The birth and death processes between two successive observation times can be described by means of the
following demographic parameters:


$a_i$ (i = 1, 2, ..., n): The average number of daughters born to each female during the time she is in
the ith age class

$b_i$ (i = 1, 2, ..., n - 1): The fraction of females in the ith age class that can be expected to survive and pass
into the (i + 1)-st age class


By their definitions, we have that

(i) $a_i \ge 0$ for i = 1, 2, ..., n
(ii) $0 < b_i \le 1$ for i = 1, 2, ..., n - 1

Note that we do not allow any $b_i$ to equal zero, because then no females would survive beyond the ith age
class. We also assume that at least one $a_i$ is positive so that some births occur. Any age class for which the
corresponding value of $a_i$ is positive is called a fertile age class.


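We next define the age distribution vector $\mathbf{x}^{(k)}$ at time $t_k$ by

\[ \mathbf{x}^{(k)} = \begin{bmatrix} x_1^{(k)} \\ x_2^{(k)} \\ \vdots \\ x_n^{(k)} \end{bmatrix} \]

where $x_i^{(k)}$ is the number of females in the ith age class at time $t_k$. Now, at time $t_k$, the females in the first age
class are just those daughters born between times $t_{k-1}$ and $t_k$. Thus, we can write

\[
\left\{\begin{matrix}\text{number of females} \\ \text{in class 1} \\ \text{at time } t_k\end{matrix}\right\}
= \left\{\begin{matrix}\text{number of daughters} \\ \text{born to females in class 1} \\ \text{between times } t_{k-1} \text{ and } t_k\end{matrix}\right\}
+ \left\{\begin{matrix}\text{number of daughters} \\ \text{born to females in class 2} \\ \text{between times } t_{k-1} \text{ and } t_k\end{matrix}\right\}
+ \cdots
+ \left\{\begin{matrix}\text{number of daughters} \\ \text{born to females in class } n \\ \text{between times } t_{k-1} \text{ and } t_k\end{matrix}\right\}
\]

or, mathematically,

\[ x_1^{(k)} = a_1 x_1^{(k-1)} + a_2 x_2^{(k-1)} + \cdots + a_n x_n^{(k-1)} \tag{1} \]


The females in the (i + 1)-st age class (i = 1, 2, ..., n - 1) at time $t_k$ are those females in the ith class at
time $t_{k-1}$ who are still alive at time $t_k$. Thus,

\[
\left\{\begin{matrix}\text{number of females} \\ \text{in class } i + 1 \\ \text{at time } t_k\end{matrix}\right\}
= \left\{\begin{matrix}\text{fraction of females in class } i \\ \text{who survive and pass} \\ \text{into class } i + 1\end{matrix}\right\}
\left\{\begin{matrix}\text{number of females} \\ \text{in class } i \\ \text{at time } t_{k-1}\end{matrix}\right\}
\]

or, mathematically,

\[ x_{i+1}^{(k)} = b_i x_i^{(k-1)}, \quad i = 1, 2, \ldots, n - 1 \tag{2} \]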

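Using matrix notation, we can write Equations 1 and 2 as

\[
\begin{bmatrix} x_1^{(k)} \\ x_2^{(k)} \\ x_3^{(k)} \\ \vdots \\ x_n^{(k)} \end{bmatrix}
= \begin{bmatrix}
a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n \\
b_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & b_2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{n-1} & 0
\end{bmatrix}
\begin{bmatrix} x_1^{(k-1)} \\ x_2^{(k-1)} \\ x_3^{(k-1)} \\ \vdots \\ x_n^{(k-1)} \end{bmatrix}
\]

or more compactly as

\[ \mathbf{x}^{(k)} = L\mathbf{x}^{(k-1)}, \quad k = 1, 2, \ldots \tag{3} \]

where L is the Leslie matrix

\[
L = \begin{bmatrix}
a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n \\
b_1 & 0 & 0 & \cdots & 0 & 0 \\
0 & b_2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{n-1} & 0
\end{bmatrix}
\tag{4}
\]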

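From Equation 3 it follows that

\[
\begin{aligned}
\mathbf{x}^{(1)} &= L\mathbf{x}^{(0)} \\
\mathbf{x}^{(2)} &= L\mathbf{x}^{(1)} = L^2\mathbf{x}^{(0)} \\
\mathbf{x}^{(3)} &= L\mathbf{x}^{(2)} = L^3\mathbf{x}^{(0)} \\
&\ \ \vdots \\
\mathbf{x}^{(k)} &= L\mathbf{x}^{(k-1)} = L^k\mathbf{x}^{(0)}
\end{aligned}
\tag{5}
\]

Thus, if we know the initial age distribution $\mathbf{x}^{(0)}$ and the Leslie matrix L, we can determine the female age
distribution at any later time.


EXAMPLE 1 Female Age Distribution for Animals


Suppose that the oldest age attained by the females in a certain animal population is 15 years
and we divide the population into three age classes with equal durations of five years. Let the
Leslie matrix for this population be

\[ L = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix} \]

If there are initially 1000 females in each of the three age classes, then from Equation 3 we
have

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 1{,}000 \\ 1{,}000 \\ 1{,}000 \end{bmatrix}
\]

\[
\mathbf{x}^{(1)} = L\mathbf{x}^{(0)} = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 1{,}000 \\ 1{,}000 \\ 1{,}000 \end{bmatrix} = \begin{bmatrix} 7{,}000 \\ 500 \\ 250 \end{bmatrix}
\]

\[
\mathbf{x}^{(2)} = L\mathbf{x}^{(1)} = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 7{,}000 \\ 500 \\ 250 \end{bmatrix} = \begin{bmatrix} 2{,}750 \\ 3{,}500 \\ 125 \end{bmatrix}
\]

\[
\mathbf{x}^{(3)} = L\mathbf{x}^{(2)} = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}
\begin{bmatrix} 2{,}750 \\ 3{,}500 \\ 125 \end{bmatrix} = \begin{bmatrix} 14{,}375 \\ 1{,}375 \\ 875 \end{bmatrix}
\]


Thus, after 15 years there are 14,375 females between 0 and 5 years of age, 1375 females
between 5 and 10 years of age, and 875 females between 10 and 15 years of age.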
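The projection in Example 1 amounts to repeated multiplication by the Leslie matrix. A minimal sketch follows, using exact fractions; the function name `project` is an illustrative choice rather than anything defined in the text.

```python
from fractions import Fraction as F

# Leslie matrix of Example 1: first row holds the birth parameters a_i,
# the subdiagonal holds the survival fractions b_i.
L = [[F(0),    F(4),    F(3)],
     [F(1, 2), F(0),    F(0)],
     [F(0),    F(1, 4), F(0)]]

def project(leslie, x0, steps):
    """Return [x(0), x(1), ..., x(steps)] where x(k) = L x(k-1)."""
    history = [list(x0)]
    for _ in range(steps):
        previous = history[-1]
        history.append([sum(l * v for l, v in zip(row, previous)) for row in leslie])
    return history

history = project(L, [1000, 1000, 1000], 3)
for k, x in enumerate(history):
    print(k, [int(v) for v in x])
```

Each step corresponds to one five-year observation interval, so `history[3]` is the age distribution after 15 years.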


Limiting Behavior 


Although Equation 5 gives the age distribution of the population at any time, it does not immediately give a
general picture of the dynamics of the growth process. For this we need to investigate the eigenvalues and
eigenvectors of the Leslie matrix. The eigenvalues of L are the roots of its characteristic polynomial. As we
ask you to verify in Exercise 2, this characteristic polynomial is

\[
p(\lambda) = |\lambda I - L|
= \lambda^n - a_1\lambda^{n-1} - a_2 b_1 \lambda^{n-2} - a_3 b_1 b_2 \lambda^{n-3} - \cdots - a_n b_1 b_2 \cdots b_{n-1}
\]


To analyze the roots of this polynomial, it will be convenient to introduce the function

\[
q(\lambda) = \frac{a_1}{\lambda} + \frac{a_2 b_1}{\lambda^2} + \frac{a_3 b_1 b_2}{\lambda^3} + \cdots + \frac{a_n b_1 b_2 \cdots b_{n-1}}{\lambda^n}
\tag{6}
\]


Using this function, the characteristic equation $p(\lambda) = 0$ can be written (verify)

\[ q(\lambda) = 1 \quad \text{for } \lambda \ne 0 \tag{7} \]


Because all the $a_i$ and $b_i$ are nonnegative, we see that $q(\lambda)$ is monotonically decreasing for $\lambda$ greater than
zero. Furthermore, $q(\lambda)$ has a vertical asymptote at $\lambda = 0$ and approaches zero as $\lambda \to \infty$. Consequently, as
Figure 10.17.1 indicates, there is a unique $\lambda$, say $\lambda = \lambda_1$, such that $q(\lambda_1) = 1$. That is, the matrix L has a
unique positive eigenvalue. It can also be shown (see Exercise 3) that $\lambda_1$ has multiplicity 1; that is, $\lambda_1$ is not a
repeated root of the characteristic equation. Although we omit the computational details, you can verify that
an eigenvector corresponding to $\lambda_1$ is

\[
\mathbf{x}_1 = \begin{bmatrix}
1 \\
b_1/\lambda_1 \\
b_1 b_2/\lambda_1^2 \\
b_1 b_2 b_3/\lambda_1^3 \\
\vdots \\
b_1 b_2 \cdots b_{n-1}/\lambda_1^{n-1}
\end{bmatrix}
\tag{8}
\]


Because $\lambda_1$ has multiplicity 1, its corresponding eigenspace has dimension 1 (Exercise 3), and so any
eigenvector corresponding to it is some multiple of $\mathbf{x}_1$. We can summarize these results in the following
theorem.


Figure 10.17.1 (graph of $q(\lambda)$ for $\lambda > 0$)


THEOREM 10.17.1 Existence of a Positive Eigenvalue


A Leslie matrix L has a unique positive eigenvalue $\lambda_1$. This eigenvalue has multiplicity 1 and an
eigenvector $\mathbf{x}_1$ all of whose entries are positive.
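Because q is monotonically decreasing for positive arguments, the unique positive root of q(λ) = 1 can be located by bisection. The sketch below is one possible numerical approach, not a method prescribed by the text; the function names, the bracketing strategy, and the tolerance are assumptions. It takes the birth parameters as a list `a` of length n and the survival fractions as a list `b` of length n - 1.

```python
def q(lam, a, b):
    """Equation 6: a1/lam + a2*b1/lam^2 + ... + an*b1*...*b(n-1)/lam^n."""
    total, survival = 0.0, 1.0
    for i, a_i in enumerate(a):
        if i > 0:
            survival *= b[i - 1]          # running product b1*b2*...*bi
        total += a_i * survival / lam ** (i + 1)
    return total

def positive_eigenvalue(a, b, tol=1e-12):
    """Solve q(lam) = 1 for lam > 0 by bisection (q is decreasing there)."""
    lo, hi = 1e-9, 1.0
    while q(hi, a, b) > 1:                # widen the bracket until q(hi) <= 1
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q(mid, a, b) > 1:
            lo = mid                      # root lies to the right of mid
        else:
            hi = mid                      # root lies at or to the left of mid
    return (lo + hi) / 2

def eigenvector(lam, b):
    """Equation 8: entries 1, b1/lam, b1*b2/lam^2, ..."""
    x, entry = [1.0], 1.0
    for b_i in b:
        entry *= b_i / lam
        x.append(entry)
    return x

# Parameters of the three-class Leslie matrix with first row (0, 4, 3)
# and survival fractions 1/2 and 1/4.
lam1 = positive_eigenvalue([0, 4, 3], [0.5, 0.25])
print(lam1, eigenvector(lam1, [0.5, 0.25]))
```

For larger matrices a general-purpose eigenvalue routine would normally be used instead; bisection is shown here only because it follows the monotonicity argument directly.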


We will now show that the long-term behavior of the age distribution of the population is determined by the
positive eigenvalue $\lambda_1$ and its eigenvector $\mathbf{x}_1$. In Exercise 9 we ask you to prove the following result.


THEOREM 10.17.2 Eigenvalues of a Leslie Matrix


If $\lambda_1$ is the unique positive eigenvalue of a Leslie matrix L, and $\lambda_k$ is any other real or complex
eigenvalue of L, then $|\lambda_k| \le \lambda_1$.


For our purposes the conclusion in Theorem 10.17.2 is not strong enough; we need $\lambda_1$ to satisfy $|\lambda_k| < \lambda_1$. In
this case $\lambda_1$ would be called the dominant eigenvalue of L. However, as the following example shows, not all
Leslie matrices satisfy this condition.


EXAMPLE 2 Leslie Matrix with No Dominant Eigenvalue


Let

\[ L = \begin{bmatrix} 0 & 0 & 6 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{3} & 0 \end{bmatrix} \]


Then the characteristic polynomial of L is

\[ p(\lambda) = |\lambda I - L| = \lambda^3 - 1 \]


The eigenvalues of L are thus the solutions of $\lambda^3 = 1$, namely,

\[ \lambda = 1, \quad -\tfrac{1}{2} + \tfrac{\sqrt{3}}{2} i, \quad -\tfrac{1}{2} - \tfrac{\sqrt{3}}{2} i \]


All three eigenvalues have absolute value 1, so the unique positive eigenvalue $\lambda_1 = 1$ is not
dominant. Note that this matrix has the property that $L^3 = I$. This means that for any choice of the
initial age distribution $\mathbf{x}^{(0)}$, we have

\[ \mathbf{x}^{(0)} = \mathbf{x}^{(3)} = \mathbf{x}^{(6)} = \cdots = \mathbf{x}^{(3k)} = \cdots \]


The age distribution vector thus oscillates with a period of three time units. Such oscillations (or
population waves, as they are called) could not occur if $\lambda_1$ were dominant, as we will see below.
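The period-three oscillation is easy to see numerically. This sketch iterates the Leslie matrix of Example 2 (first row 0, 0, 6; survival fractions 1/2 and 1/3) from a hypothetical initial distribution of 1000 females per class, using exact fractions so that the return to the starting vector is exact.

```python
from fractions import Fraction as F

# Leslie matrix of Example 2: only the third age class is fertile.
L = [[F(0),    F(0),    F(6)],
     [F(1, 2), F(0),    F(0)],
     [F(0),    F(1, 3), F(0)]]

def step(leslie, x):
    """Return L x, the age distribution one time unit later."""
    return [sum(l * v for l, v in zip(row, x)) for row in leslie]

x0 = [F(1000), F(1000), F(1000)]   # hypothetical initial age distribution
trajectory = [x0]
for _ in range(6):
    trajectory.append(step(L, trajectory[-1]))

for k, x in enumerate(trajectory):
    print(k, [round(float(v), 2) for v in x])
```

The printed vectors at steps 0, 3, and 6 coincide, while the intermediate vectors differ, which is the population wave described above.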


It is beyond the scope of this book to discuss necessary and sufficient conditions for $\lambda_1$ to be a dominant
eigenvalue. However, we will state the following sufficient condition without proof.


THEOREM 10.17.3 Dominant Eigenvalue


If two successive entries $a_i$ and $a_{i+1}$ in the first row of a Leslie matrix L are nonzero, then the
positive eigenvalue of L is dominant.


Thus, if the female population has two successive fertile age classes, then its Leslie matrix has a dominant
eigenvalue. This is always the case for realistic populations if the duration of the age classes is sufficiently
small. Note that in Example 2 there is only one fertile age class (the third), so the condition of Theorem
10.17.3 is not satisfied. In what follows, we always assume that the condition of Theorem 10.17.3 is satisfied.


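Let us assume that L is diagonalizable. This is not really necessary for the conclusions we will draw, but it
does simplify the arguments. In this case, L has n eigenvalues, $\lambda_1, \lambda_2, \ldots, \lambda_n$, not necessarily distinct, and n
linearly independent eigenvectors, $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, corresponding to them. In this listing we place the dominant
eigenvalue $\lambda_1$ first. We construct a matrix P whose columns are the eigenvectors of L:

\[ P = [\mathbf{x}_1 \mid \mathbf{x}_2 \mid \mathbf{x}_3 \mid \cdots \mid \mathbf{x}_n] \]


The diagonalization of L is then given by the equation

\[
L = P \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} P^{-1}
\]


From this it follows that

\[
L^k = P \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} P^{-1}
\]

for k = 1, 2, .... For any initial age distribution vector $\mathbf{x}^{(0)}$, we then have

\[
L^k\mathbf{x}^{(0)} = P \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix} P^{-1}\mathbf{x}^{(0)}
\]

for k = 1, 2, .... Dividing both sides of this equation by $\lambda_1^k$ and using the fact that $\mathbf{x}^{(k)} = L^k\mathbf{x}^{(0)}$, we have

\[
\frac{1}{\lambda_1^k}\mathbf{x}^{(k)} = P \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & \left(\frac{\lambda_2}{\lambda_1}\right)^k & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \left(\frac{\lambda_n}{\lambda_1}\right)^k
\end{bmatrix} P^{-1}\mathbf{x}^{(0)}
\tag{9}
\]

Because $\lambda_1$ is the dominant eigenvalue, we have $|\lambda_i/\lambda_1| < 1$ for i = 2, 3, ..., n. It follows that

\[ (\lambda_i/\lambda_1)^k \to 0 \quad \text{as } k \to \infty \quad \text{for } i = 2, 3, \ldots, n \]

Using this fact, we can take the limit of both sides of 9 to obtain

\[
\lim_{k \to \infty} \left\{ \frac{1}{\lambda_1^k}\mathbf{x}^{(k)} \right\} = P \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0
\end{bmatrix} P^{-1}\mathbf{x}^{(0)}
\tag{10}
\]


Let us denote the first entry of the column vector $P^{-1}\mathbf{x}^{(0)}$ by the constant c. As we ask you to show in
Exercise 4, the right side of 10 can be written as $c\mathbf{x}_1$, where c is a positive constant that depends only on the
initial age distribution vector $\mathbf{x}^{(0)}$. Thus, 10 becomes

\[ \lim_{k \to \infty} \left\{ \frac{1}{\lambda_1^k}\mathbf{x}^{(k)} \right\} = c\mathbf{x}_1 \tag{11} \]

Equation 11 gives us the approximation

\[ \mathbf{x}^{(k)} \approx c\lambda_1^k\mathbf{x}_1 \tag{12} \]

for large values of k. From 12 we also have

\[ \mathbf{x}^{(k-1)} \approx c\lambda_1^{k-1}\mathbf{x}_1 \tag{13} \]

Comparing Equations 12 and 13, we see that

\[ \mathbf{x}^{(k)} \approx \lambda_1\mathbf{x}^{(k-1)} \tag{14} \]

for large values of k. This means that for large values of time, each age distribution vector is a scalar multiple
of the preceding age distribution vector, the scalar being the positive eigenvalue of the Leslie matrix.
Consequently, the proportion of females in each of the age classes becomes constant. As we will see in the
following example, these limiting proportions can be determined from the eigenvector $\mathbf{x}_1$.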


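EXAMPLE 3 Example 1 Revisited


The Leslie matrix in Example 1 was

\[ L = \begin{bmatrix} 0 & 4 & 3 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & \tfrac{1}{4} & 0 \end{bmatrix} \]

Its characteristic polynomial is $p(\lambda) = \lambda^3 - 2\lambda - \tfrac{3}{8}$, and you can verify that the positive
eigenvalue is $\lambda_1 = \tfrac{3}{2}$. From 8 the corresponding eigenvector $\mathbf{x}_1$ is

\[
\mathbf{x}_1 = \begin{bmatrix} 1 \\ b_1/\lambda_1 \\ b_1 b_2/\lambda_1^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ \tfrac{1}{2} \big/ \tfrac{3}{2} \\ \left(\tfrac{1}{2}\right)\left(\tfrac{1}{4}\right) \big/ \left(\tfrac{3}{2}\right)^2 \end{bmatrix}
= \begin{bmatrix} 1 \\ \tfrac{1}{3} \\ \tfrac{1}{18} \end{bmatrix}
\]


From 14 we have

\[ \mathbf{x}^{(k)} \approx \tfrac{3}{2}\mathbf{x}^{(k-1)} \]

for large values of k. Hence, every five years the number of females in each of the three classes
will increase by about 50%, as will the total number of females in the population.


From 12 we have

\[ \mathbf{x}^{(k)} \approx c\left(\tfrac{3}{2}\right)^k \begin{bmatrix} 1 \\ \tfrac{1}{3} \\ \tfrac{1}{18} \end{bmatrix} \]

Consequently, eventually the females will be distributed among the three age classes in the ratios
$1 : \tfrac{1}{3} : \tfrac{1}{18}$. This corresponds to a distribution of 72% of the females in the first age class, 24% of the
females in the second age class, and 4% of the females in the third age class.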
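The percentages quoted in Example 3 follow from normalizing the eigenvector so that its entries sum to 1. The short sketch below carries out that arithmetic exactly and also confirms that the vector really is an eigenvector for the eigenvalue 3/2; variable names are illustrative.

```python
from fractions import Fraction as F

L = [[F(0),    F(4),    F(3)],
     [F(1, 2), F(0),    F(0)],
     [F(0),    F(1, 4), F(0)]]
lam1 = F(3, 2)                     # positive eigenvalue from Example 3
b = [F(1, 2), F(1, 4)]             # survival fractions (subdiagonal of L)

# Build the eigenvector 1, b1/lam1, b1*b2/lam1^2 entry by entry.
x1 = [F(1)]
for b_i in b:
    x1.append(x1[-1] * b_i / lam1)

# Check L x1 = lam1 x1.
Lx1 = [sum(l * v for l, v in zip(row, x1)) for row in L]
assert Lx1 == [lam1 * v for v in x1]

# Normalize to obtain the limiting proportion of females in each class.
total = sum(x1)
proportions = [v / total for v in x1]
print([str(v) for v in x1])                       # ['1', '1/3', '1/18']
print([f"{float(p):.0%}" for p in proportions])   # ['72%', '24%', '4%']
```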


EXAMPLE 4 Female Age Distribution for Humans + 


In this example we use birth and death parameters from the year 1965 for Canadian females. 
Because few women over 50 years of age bear children, we restrict ourselves to the portion of the 
female population between 0 and 50 years of age. The data are for 5-year age classes, so there are a 
total of 10 age classes. Rather than writing out the 10 x 10 Leslie matrix in full, we list the birth 
and death parameters as follows: 


Age interval | a_i     | b_i
[0, 5)       | 0.00000 | 0.99651
[5, 10)      | 0.00024 | 0.99820
[10, 15)     | 0.05861 | 0.99802
[15, 20)     | 0.28608 | 0.99729
[20, 25)     | 0.44791 | 0.99694
[25, 30)     | 0.36399 | 0.99621
[30, 35)     | 0.22259 | 0.99460
[35, 40)     | 0.10457 | 0.99184
[40, 45)     | 0.02826 | 0.98700
[45, 50)     | 0.00240 | -


Using numerical techniques, we can approximate the positive eigenvalue and corresponding
eigenvector by

$$\lambda_1 = 1.07622 \quad\text{and}\quad \mathbf{x}_1 = \begin{bmatrix} 1.00000\\ 0.92594\\ 0.85881\\ 0.79641\\ 0.73800\\ 0.68364\\ 0.63281\\ 0.58482\\ 0.53897\\ 0.49429 \end{bmatrix}$$


Thus, if Canadian women continued to reproduce and die as they did in 1965, eventually every 5
years their numbers would increase by 7.622%. From the eigenvector $\mathbf{x}_1$, we see that, in the limit,
for every 100,000 females between 0 and 5 years of age, there will be 92,594 females between 5
and 10 years of age, 85,881 females between 10 and 15 years of age, and so forth.
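
The "numerical techniques" mentioned in Example 4 are not specified. One simple possibility, sketched here as an assumption rather than as the method actually used, is bisection on the characteristic equation of a Leslie matrix written in the form $q(\lambda) = a_1/\lambda + a_2b_1/\lambda^2 + \cdots + a_nb_1b_2\cdots b_{n-1}/\lambda^n = 1$, where $q$ is decreasing for $\lambda > 0$. The function names and the bracketing interval are hypothetical choices for illustration.

```python
# Birth parameters a_i and survival parameters b_i from the table in Example 4.
a = [0.00000, 0.00024, 0.05861, 0.28608, 0.44791,
     0.36399, 0.22259, 0.10457, 0.02826, 0.00240]
b = [0.99651, 0.99820, 0.99802, 0.99729, 0.99694,
     0.99621, 0.99460, 0.99184, 0.98700]

def survival_products(b):
    """Return [1, b1, b1*b2, ..., b1*b2*...*b(n-1)]."""
    products = [1.0]
    for b_i in b:
        products.append(products[-1] * b_i)
    return products

def q(lam, a, b):
    """q(lam) = a1/lam + a2*b1/lam^2 + ... + an*b1*...*b(n-1)/lam^n."""
    return sum(a_i * s / lam ** (i + 1)
               for i, (a_i, s) in enumerate(zip(a, survival_products(b))))

def positive_eigenvalue(a, b, lo=0.01, hi=10.0, tol=1e-12):
    """Bisection on q(lam) = 1; assumes the root lies between lo and hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid, a, b) > 1.0:
            lo = mid      # q is decreasing, so the root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam1 = positive_eigenvalue(a, b)
# Eigenvector in the normalized form [1, b1/lam1, b1*b2/lam1^2, ...].
x1 = [s / lam1 ** i for i, s in enumerate(survival_products(b))]

print("positive eigenvalue:", round(lam1, 5))
print("eigenvector:", [round(v, 5) for v in x1])
```

The printed values can be compared with the eigenvalue and eigenvector quoted in the example.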


Let us look again at Equation 12, which gives the age distribution vector of the population for large times:
$$\mathbf{x}^{(k)} \approx c\lambda_1^k\mathbf{x}_1 \tag{15}$$

Three cases arise according to the value of the positive eigenvalue $\lambda_1$:
(i) The population is eventually increasing if $\lambda_1 > 1$.
(ii) The population is eventually decreasing if $\lambda_1 < 1$.
(iii) The population eventually stabilizes if $\lambda_1 = 1$.
The case $\lambda_1 = 1$ is particularly interesting because it determines a population that has zero population
growth. For any initial age distribution, the population approaches a limiting age distribution that is some
multiple of the eigenvector $\mathbf{x}_1$. From Equations 6 and 7, we see that $\lambda_1 = 1$ is an eigenvalue if and only if

$$a_1 + a_2b_1 + a_3b_1b_2 + \cdots + a_nb_1b_2\cdots b_{n-1} = 1 \tag{16}$$
The expression
$$R = a_1 + a_2b_1 + a_3b_1b_2 + \cdots + a_nb_1b_2\cdots b_{n-1} \tag{17}$$

is called the net reproduction rate of the population. (See Exercise 5 for a demographic interpretation of R.)
Thus, we can say that a population has zero population growth if and only if its net reproduction rate is 1.
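
As an illustration of Equation 17, the following short sketch computes the net reproduction rate from the birth parameters $a_i$ and survival parameters $b_i$. The function name is a hypothetical choice; the data are those of Example 1 and of the table in Example 4. The last lines rescale the birth parameters of Example 1 by $1/R$, which under Equation 16 should give a population with zero population growth.

```python
def net_reproduction_rate(a, b):
    """R = a1 + a2*b1 + a3*b1*b2 + ... + an*b1*...*b(n-1)  (Equation 17)."""
    R, survival = 0.0, 1.0
    for i, a_i in enumerate(a):
        if i > 0:
            survival *= b[i - 1]   # running product b1*b2*...*b_i
        R += a_i * survival
    return R

# Example 1: a = (0, 4, 3), b = (1/2, 1/4).
R_example1 = net_reproduction_rate([0.0, 4.0, 3.0], [0.5, 0.25])

# Example 4: Canadian females, 1965.
a_canada = [0.00000, 0.00024, 0.05861, 0.28608, 0.44791,
            0.36399, 0.22259, 0.10457, 0.02826, 0.00240]
b_canada = [0.99651, 0.99820, 0.99802, 0.99729, 0.99694,
            0.99621, 0.99460, 0.99184, 0.98700]
R_canada = net_reproduction_rate(a_canada, b_canada)

# Dividing every birth parameter by R gives net reproduction rate 1.
scaled_births = [a_i / R_example1 for a_i in [0.0, 4.0, 3.0]]
R_scaled = net_reproduction_rate(scaled_births, [0.5, 0.25])

print(R_example1, round(R_canada, 5), R_scaled)
```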


Exercise Set 10.17 


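1. Suppose that a certain animal population is divided into two age classes and has a Leslie matrix
$$L = \begin{bmatrix} 1 & \tfrac{3}{2}\\ \tfrac{1}{2} & 0 \end{bmatrix}$$
(a) Calculate the positive eigenvalue $\lambda_1$ of L and the corresponding eigenvector $\mathbf{x}_1$.


(b) Beginning with the initial age distribution vector
$$\mathbf{x}^{(0)} = \begin{bmatrix} 100\\ 0 \end{bmatrix}$$
calculate $\mathbf{x}^{(1)}$, $\mathbf{x}^{(2)}$, $\mathbf{x}^{(3)}$, $\mathbf{x}^{(4)}$, and $\mathbf{x}^{(5)}$, rounding off to the nearest integer when necessary.


(c) Calculate $\mathbf{x}^{(6)}$ using the exact formula $\mathbf{x}^{(6)} = L\mathbf{x}^{(5)}$ and using the approximation formula $\mathbf{x}^{(6)} \approx \lambda_1\mathbf{x}^{(5)}$.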


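Answer:


(a) $\lambda_1 = \tfrac{3}{2}$, $\mathbf{x}_1 = \begin{bmatrix} 1\\ \tfrac{1}{3} \end{bmatrix}$


(b) $\mathbf{x}^{(1)} = \begin{bmatrix} 100\\ 50 \end{bmatrix}$, $\mathbf{x}^{(2)} = \begin{bmatrix} 175\\ 50 \end{bmatrix}$, $\mathbf{x}^{(3)} = \begin{bmatrix} 250\\ 88 \end{bmatrix}$, $\mathbf{x}^{(4)} = \begin{bmatrix} 382\\ 125 \end{bmatrix}$, $\mathbf{x}^{(5)} = \begin{bmatrix} 570\\ 191 \end{bmatrix}$
(c) $\mathbf{x}^{(6)} = L\mathbf{x}^{(5)} = \begin{bmatrix} 857\\ 285 \end{bmatrix}$, $\mathbf{x}^{(6)} \approx \lambda_1\mathbf{x}^{(5)} = \begin{bmatrix} 855\\ 287 \end{bmatrix}$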


2. Find the characteristic polynomial of a general Leslie matrix given by Equation 4. 


3. (a) Show that the positive eigenvalue $\lambda_1$ of a Leslie matrix is always simple. Recall that a root $\lambda_0$ of a
polynomial $q(\lambda)$ is simple if and only if $q'(\lambda_0) \neq 0$.


(b) Show that the eigenspace corresponding to $\lambda_1$ has dimension 1.

4. Show that the right side of Equation 10 is $c\mathbf{x}_1$, where c is the first entry of the column vector $P^{-1}\mathbf{x}^{(0)}$.


5. Show that the net reproduction rate R, defined by 17, can be interpreted as the average number of 
daughters born to a single female during her expected lifetime. 


6. Show that a population is eventually decreasing if and only if its net reproduction rate is less than 1. 
Similarly, show that a population is eventually increasing if and only if its net reproduction rate is greater 
than 1. 


7. Calculate the net reproduction rate of the animal population in Example 1. 
Answer: 


2.375 


8. (For readers with a hand calculator) Calculate the net reproduction rate of the Canadian female 
population in Example 4. 


Answer: 


1.49611 


9. (For readers who have read Section 10.1 through Section 10.3) Prove Theorem 10.17.2. [Hint: Write $\lambda_k = re^{i\theta}$,
substitute into 7, take the real parts of both sides, and show that $r \le \lambda_1$.]


Section 10.17 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be 
MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra 
software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to 
read the relevant documentation for the particular utility you are using. The goal of these exercises is to
provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


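T1. Consider the sequence of Leslie matrices

$$L_2 = \begin{bmatrix} 0 & a\\ b_1 & 0 \end{bmatrix},\quad L_3 = \begin{bmatrix} 0 & 0 & a\\ b_1 & 0 & 0\\ 0 & b_2 & 0 \end{bmatrix},\quad L_4 = \begin{bmatrix} 0 & 0 & 0 & a\\ b_1 & 0 & 0 & 0\\ 0 & b_2 & 0 & 0\\ 0 & 0 & b_3 & 0 \end{bmatrix},\quad L_5 = \begin{bmatrix} 0 & 0 & 0 & 0 & a\\ b_1 & 0 & 0 & 0 & 0\\ 0 & b_2 & 0 & 0 & 0\\ 0 & 0 & b_3 & 0 & 0\\ 0 & 0 & 0 & b_4 & 0 \end{bmatrix},\ \ldots$$


(a) Use a computer to show that
$$L_2^2 = I_2,\quad L_3^3 = I_3,\quad L_4^4 = I_4,\quad L_5^5 = I_5,\ \ldots$$
for a suitable choice of a in terms of $b_1, b_2, \ldots, b_{n-1}$.


(b) From your results in part (a), conjecture a relationship between a and $b_1, b_2, \ldots, b_{n-1}$ that will make
$L_n^n = I_n$, where

$$L_n = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & a\\ b_1 & 0 & 0 & \cdots & 0 & 0\\ 0 & b_2 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & b_{n-1} & 0 \end{bmatrix}$$


(c) Determine an expression for $p_n(\lambda) = |\lambda I_n - L_n|$ and use it to show that all eigenvalues of $L_n$ satisfy
$|\lambda| = 1$ when a and $b_1, b_2, \ldots, b_{n-1}$ are related by the equation determined in part (b).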


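T2. Consider the sequence of Leslie matrices

$$L_2 = \begin{bmatrix} a & ap\\ b & 0 \end{bmatrix},\quad L_3 = \begin{bmatrix} a & ap & ap^2\\ b & 0 & 0\\ 0 & b & 0 \end{bmatrix},\quad L_4 = \begin{bmatrix} a & ap & ap^2 & ap^3\\ b & 0 & 0 & 0\\ 0 & b & 0 & 0\\ 0 & 0 & b & 0 \end{bmatrix},$$

$$L_5 = \begin{bmatrix} a & ap & ap^2 & ap^3 & ap^4\\ b & 0 & 0 & 0 & 0\\ 0 & b & 0 & 0 & 0\\ 0 & 0 & b & 0 & 0\\ 0 & 0 & 0 & b & 0 \end{bmatrix},\ \ldots,\quad L_n = \begin{bmatrix} a & ap & ap^2 & \cdots & ap^{n-2} & ap^{n-1}\\ b & 0 & 0 & \cdots & 0 & 0\\ 0 & b & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & b & 0 \end{bmatrix}$$


where $0 < p < 1$, $0 < b < 1$, and $1 < a$.
(a) Choose a value for n (say, n = 8). For various values of a, b, and p, use a computer to determine the
dominant eigenvalue of $L_n$, and then compare your results to the value of $a + bp$.


(b) Show that

$$p_n(\lambda) = |\lambda I_n - L_n| = \lambda^n - a\,\frac{\lambda^n - (bp)^n}{\lambda - bp}$$

which means that the eigenvalues of $L_n$ must satisfy
$$\lambda^{n+1} - (a + bp)\lambda^n + a(bp)^n = 0$$
(c) Can you now provide a rough proof to explain the fact that $\lambda_1 \approx a + bp$?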


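T3. Suppose that a population of mice has a Leslie matrix L over a 1-month period and an initial age
distribution vector $\mathbf{x}^{(0)}$ given by

$$L = \begin{bmatrix} 0 & 0 & \tfrac{1}{2} & \tfrac{4}{5} & \tfrac{3}{10} & 0\\ \tfrac{4}{5} & 0 & 0 & 0 & 0 & 0\\ 0 & \tfrac{9}{10} & 0 & 0 & 0 & 0\\ 0 & 0 & \tfrac{9}{10} & 0 & 0 & 0\\ 0 & 0 & 0 & \tfrac{4}{5} & 0 & 0\\ 0 & 0 & 0 & 0 & \tfrac{3}{10} & 0 \end{bmatrix} \quad\text{and}\quad \mathbf{x}^{(0)} = \begin{bmatrix} 50\\ 40\\ 30\\ 20\\ 10\\ 5 \end{bmatrix}$$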


(a) Compute the net reproduction rate of the population. 

(b) Compute the age distribution vector after 100 months and 101 months, and show that the vector after 101
months is approximately a scalar multiple of the vector after 100 months.

(c) Compute the dominant eigenvalue of L and its corresponding eigenvector. How are they related to your 
results in part (b)? 

(d) Suppose you wish to control the mouse population by feeding it a substance that decreases its age-specific 
birthrates (the entries in the first row of L) by a constant fraction. What range of fractions would cause the 
population eventually to decrease? 
10.18 Harvesting of Animal Populations


In this section we employ the Leslie matrix model of population growth to model the sustainable harvesting 
of an animal population. We also examine the effect of harvesting different fractions of different age groups. 


Prerequisites 


Age-Specific Population Growth (Section 10.17) 


Harvesting 


In Section 10.17 we used the Leslie matrix model to examine the growth of a female population that was 
divided into discrete age classes. In this section, we investigate the effects of harvesting an animal population 
growing according to such a model. By harvesting we mean the removal of animals from the population. 
(The word harvesting is not necessarily a euphemism for “slaughtering”; the animals may be removed from 
the population for other purposes.) 


In this section we restrict ourselves to sustainable harvesting policies. By this we mean the following: 


DEFINITION 1 


A harvesting policy in which an animal population is periodically harvested is said to be sustainable 
if the yield of each harvest is the same and the age distribution of the population remaining after each 
harvest is the same. 


Thus, the animal population is not depleted by a sustainable harvesting policy; only the excess growth is
removed.


As in Section 10.17, we will discuss only the females of the population. If the number of males in each age 
class is equal to the number of females—a reasonable assumption for many populations—then our harvesting 
policies will also apply to the male portion of the population. 


The Harvesting Model 


Figure 10.18.1 illustrates the basic idea of the model. We begin with a population having a particular age
distribution. It undergoes a growth period that will be described by the Leslie matrix. At the end of the growth
period, a certain fraction of each age class is harvested in such a way that the unharvested population has the
same age distribution as the original population. This cycle repeats after each harvest so that the yield is
sustainable. The duration of the harvest is assumed to be short in comparison with the growth period so that
any growth or change in the population during the harvest period can be neglected.


Figure 10.18.1 [Diagram with panels labeled "Population before growth period," "Population after growth period," and "Population harvested," the last divided into "Harvested" and "Not harvested."]


To describe this harvesting model mathematically, let

$$\mathbf{x} = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}$$

be the age distribution vector of the population at the beginning of the growth period. Thus $x_i$ is the number
of females in the ith class left unharvested. As in Section 10.17, we require that the duration of each age class
be identical with the duration of the growth period. For example, if the population is harvested once a year,
then the population is divided into 1-year age classes.


If L is the Leslie matrix describing the growth of the population, then the vector $L\mathbf{x}$ is the age distribution
vector of the population at the end of the growth period, immediately before the periodic harvest. Let $h_i$, for
$i = 1, 2, \ldots, n$, be the fraction of females from the ith class that is harvested. We use these n numbers to form
an $n \times n$ diagonal matrix
$$H = \begin{bmatrix} h_1 & 0 & 0 & \cdots & 0\\ 0 & h_2 & 0 & \cdots & 0\\ 0 & 0 & h_3 & \cdots & 0\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & 0 & \cdots & h_n \end{bmatrix}$$
which we will call the harvesting matrix. By definition, we have
$$0 \le h_i \le 1 \qquad (i = 1, 2, \ldots, n)$$
That is, we can harvest none ($h_i = 0$), all ($h_i = 1$), or some fraction ($0 < h_i < 1$) of each of the n classes.


Because the number of females in the ith class immediately before each harvest is the ith entry $(L\mathbf{x})_i$ of the
vector $L\mathbf{x}$, the ith entry of the column vector

$$HL\mathbf{x} = \begin{bmatrix} h_1(L\mathbf{x})_1\\ h_2(L\mathbf{x})_2\\ \vdots\\ h_n(L\mathbf{x})_n \end{bmatrix}$$

is the number of females harvested from the ith class.


From the definition of a sustainable harvesting policy, we have

$$\begin{bmatrix}\text{age distribution}\\ \text{at end of}\\ \text{growth period}\end{bmatrix} - [\text{harvest}] = \begin{bmatrix}\text{age distribution}\\ \text{at beginning of}\\ \text{growth period}\end{bmatrix}$$
or, mathematically,
$$L\mathbf{x} - HL\mathbf{x} = \mathbf{x} \tag{1}$$

If we write Equation 1 in the form
$$(I - H)L\mathbf{x} = \mathbf{x} \tag{2}$$

we see that $\mathbf{x}$ must be an eigenvector of the matrix $(I - H)L$ corresponding to the eigenvalue 1. As we will
now show, this places certain restrictions on the values of $h_i$ and $\mathbf{x}$.


Suppose that the Leslie matrix of the population is

$$L = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_n\\ b_1 & 0 & 0 & \cdots & 0 & 0\\ 0 & b_2 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & b_{n-1} & 0 \end{bmatrix} \tag{3}$$
Then the matrix $(I - H)L$ is (verify)
$$(I - H)L = \begin{bmatrix} (1-h_1)a_1 & (1-h_1)a_2 & (1-h_1)a_3 & \cdots & (1-h_1)a_{n-1} & (1-h_1)a_n\\ (1-h_2)b_1 & 0 & 0 & \cdots & 0 & 0\\ 0 & (1-h_3)b_2 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & (1-h_n)b_{n-1} & 0 \end{bmatrix}$$

Thus, we see that $(I - H)L$ is a matrix with the same mathematical form as a Leslie matrix. In Section 10.17
we showed that a necessary and sufficient condition for a Leslie matrix to have 1 as an eigenvalue is that its
net reproduction rate also be 1 [see Eq. 16 of Section 10.17]. Calculating the net reproduction rate of
$(I - H)L$ and setting it equal to 1, we obtain (verify)
$$(1-h_1)\left[a_1 + a_2b_1(1-h_2) + a_3b_1b_2(1-h_2)(1-h_3) + \cdots + a_nb_1b_2\cdots b_{n-1}(1-h_2)(1-h_3)\cdots(1-h_n)\right] = 1 \tag{4}$$

This equation places a restriction on the allowable harvesting fractions. Only those values of $h_1, h_2, \ldots, h_n$
that satisfy 4 and that lie in the interval [0, 1] can produce a sustainable yield.


If $h_1, h_2, \ldots, h_n$ do satisfy 4, then the matrix $(I - H)L$ has the desired eigenvalue $\lambda_1 = 1$. Furthermore, this
eigenvalue has multiplicity 1, because the positive eigenvalue of a Leslie matrix always has multiplicity 1
(Theorem 10.17.1). This means that there is only one linearly independent eigenvector $\mathbf{x}$ satisfying Equation
2. [See Exercise 3(b) of Section 10.17.] One possible choice for $\mathbf{x}$ is the following normalized eigenvector:

$$\mathbf{x}_1 = \begin{bmatrix} 1\\ b_1(1-h_2)\\ b_1b_2(1-h_2)(1-h_3)\\ b_1b_2b_3(1-h_2)(1-h_3)(1-h_4)\\ \vdots\\ b_1b_2b_3\cdots b_{n-1}(1-h_2)(1-h_3)\cdots(1-h_n) \end{bmatrix} \tag{5}$$

Any other solution $\mathbf{x}$ of 2 is a multiple of $\mathbf{x}_1$. Thus, the vector $\mathbf{x}_1$ determines the proportion of females within
each of the n classes after a harvest under a sustainable harvesting policy. But there is an ambiguity in the
total number of females in the population after each harvest. This can be determined by some auxiliary
condition, such as an ecological or economic constraint. For example, for a population economically
supported by the harvester, the largest population the harvester can afford to raise between harvests would
determine the particular constant that $\mathbf{x}_1$ is multiplied by to produce the appropriate vector $\mathbf{x}$ in Equation 2.
For a wild population, the natural habitat of the population would determine how large the total population
could be between harvests.


Summarizing our results so far, we see that there is a wide choice in the values of $h_1, h_2, \ldots, h_n$ that will
produce a sustainable yield. But once these values are selected, the proportional age distribution of the
population after each harvest is uniquely determined by the normalized eigenvector $\mathbf{x}_1$ defined by Equation 5.
We now consider a few particular harvesting strategies of this type.
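
Before turning to particular strategies, the relationship among Equations 2, 4, and 5 can be checked on a small case. The sketch below is an illustration only: it uses the three-class birth and survival parameters $a = (0, 4, 3)$, $b = (\tfrac{1}{2}, \tfrac{1}{4})$ from Section 10.17, and the fractions $h_2 = 0.2$, $h_3 = 0.5$ are hypothetical values chosen for the demonstration. It solves Equation 4 for $h_1$, forms the vector of Equation 5, and confirms that $(I - H)L\mathbf{x}_1 = \mathbf{x}_1$. All function names are assumptions.

```python
def leslie_matrix(a, b):
    """Build the Leslie matrix of Equation 3 from birth and survival parameters."""
    n = len(a)
    L = [list(a)] + [[0.0] * n for _ in range(n - 1)]
    for i, b_i in enumerate(b):
        L[i + 1][i] = b_i
    return L

def mat_vec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def solve_h1(a, b, h_rest):
    """Solve Equation 4 for h1, given h_rest = [h2, ..., hn]."""
    bracket, factor = a[0], 1.0
    for i in range(1, len(a)):
        factor *= b[i - 1] * (1.0 - h_rest[i - 1])
        bracket += a[i] * factor
    return 1.0 - 1.0 / bracket

def post_harvest_vector(b, h):
    """Normalized eigenvector of Equation 5 for h = [h1, ..., hn]."""
    x = [1.0]
    for i in range(1, len(h)):
        x.append(x[-1] * b[i - 1] * (1.0 - h[i]))
    return x

a, b = [0.0, 4.0, 3.0], [0.5, 0.25]
h_rest = [0.2, 0.5]                      # hypothetical h2 and h3
h = [solve_h1(a, b, h_rest)] + h_rest    # h1 follows from Equation 4

L = leslie_matrix(a, b)
x1 = post_harvest_vector(b, h)
# (I - H) L x1: multiply the ith entry of L x1 by (1 - h_i).
after_harvest = [(1.0 - h_i) * v for h_i, v in zip(h, mat_vec(L, x1))]

print("h1 =", round(h[0], 6))
print("x1 =", x1)
print("(I - H) L x1 =", after_harvest)
```

A value of $h_1$ outside the interval [0, 1] would indicate that the chosen $h_2, \ldots, h_n$ cannot be part of a sustainable policy.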


Uniform Harvesting 


With many populations it is difficult to distinguish or catch animals of specific ages. If animals are caught at
random, we can reasonably assume that the same fraction of each age class is harvested. We therefore set

$$h = h_1 = h_2 = \cdots = h_n$$
Equation 2 then reduces to (verify)
$$L\mathbf{x} = \left(\frac{1}{1-h}\right)\mathbf{x}$$
Hence, $1/(1-h)$ must be the unique positive eigenvalue $\lambda_1$ of the Leslie growth matrix L. That is,
$$\lambda_1 = \frac{1}{1-h}$$

Solving for the harvesting fraction h, we obtain
$$h = 1 - (1/\lambda_1) \tag{6}$$

The vector $\mathbf{x}_1$, in this case, is the same as the eigenvector of L corresponding to the eigenvalue $\lambda_1$. From
Equation 8 of Section 10.17, this is
$$\mathbf{x}_1 = \begin{bmatrix} 1\\ b_1/\lambda_1\\ b_1b_2/\lambda_1^2\\ b_1b_2b_3/\lambda_1^3\\ \vdots\\ b_1b_2\cdots b_{n-1}/\lambda_1^{n-1} \end{bmatrix} \tag{7}$$

From 6 we can see that the larger $\lambda_1$ is, the larger is the fraction of animals we can harvest without depleting
the population. Note that we need $\lambda_1 > 1$ in order for the harvesting fraction h to lie in the interval (0, 1).
This is to be expected, because $\lambda_1 > 1$ is the condition that the population be increasing.


EXAMPLE 1 Harvesting Sheep


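For a certain species of domestic sheep in New Zealand with a growth period of 1 year, the
following Leslie matrix was found (see G. Caughley, "Parameters for Seasonally Breeding
Populations," Ecology, 48, 1967, pp. 834-839).

$$L = \left[\begin{array}{cccccccccccc}
.000 & .045 & .391 & .472 & .484 & .546 & .543 & .502 & .468 & .459 & .433 & .421\\
.845 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & .975 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & .965 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & .950 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & .926 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & .895 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & .850 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & .786 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & .691 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & .561 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & .370 & 0
\end{array}\right]$$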


The sheep have a lifespan of 12 years, so they are divided into 12 age classes of duration 1 year
each. By the use of numerical techniques, the unique positive eigenvalue of L can be found to
be

$$\lambda_1 = 1.176$$
From Equation 6, the harvesting fraction h is
$$h = 1 - (1/\lambda_1) = 1 - (1/1.176) = .150$$

Thus, the uniform harvesting policy is one in which 15.0% of the sheep from each of the 12
age classes is harvested every year. From 7 the age distribution vector of the sheep after each
harvest is proportional to

$$\mathbf{x}_1 = \begin{bmatrix} 1.000\\ 0.719\\ 0.596\\ 0.489\\ 0.395\\ 0.311\\ 0.237\\ 0.171\\ 0.114\\ 0.067\\ 0.032\\ 0.010 \end{bmatrix} \tag{8}$$

From 8 we see that for every 1000 sheep between 0 and 1 year of age that are not harvested,
there are 719 sheep between 1 and 2 years of age, 596 sheep between 2 and 3 years of age, and
so forth.
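
The numbers in this example can be reproduced approximately with a few lines of code. The sketch below is one possible approach, not necessarily the numerical technique used for the example: it finds the positive eigenvalue by bisection on the characteristic equation written as $a_1/\lambda + a_2b_1/\lambda^2 + \cdots + a_nb_1\cdots b_{n-1}/\lambda^n = 1$, then applies Equations 6 and 7. The function names and the bracketing interval are assumptions, and small differences from the rounded values in the text are to be expected.

```python
# First row (a_i) and subdiagonal (b_i) of the sheep Leslie matrix.
a = [0.000, 0.045, 0.391, 0.472, 0.484, 0.546,
     0.543, 0.502, 0.468, 0.459, 0.433, 0.421]
b = [0.845, 0.975, 0.965, 0.950, 0.926, 0.895,
     0.850, 0.786, 0.691, 0.561, 0.370]

def survival_products(b):
    """Return [1, b1, b1*b2, ..., b1*b2*...*b(n-1)]."""
    products = [1.0]
    for b_i in b:
        products.append(products[-1] * b_i)
    return products

def q(lam, a, b):
    """Left side of the characteristic equation q(lam) = 1."""
    return sum(a_i * s / lam ** (i + 1)
               for i, (a_i, s) in enumerate(zip(a, survival_products(b))))

def positive_eigenvalue(a, b, lo=0.01, hi=10.0, tol=1e-12):
    """Bisection; q is decreasing for lam > 0 and the root is assumed to lie in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid, a, b) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam1 = positive_eigenvalue(a, b)
h = 1.0 - 1.0 / lam1                                              # Equation 6
x1 = [s / lam1 ** i for i, s in enumerate(survival_products(b))]  # Equation 7

print("lambda_1 is approximately", round(lam1, 3))
print("uniform harvesting fraction is approximately", round(h, 3))
print("x1 is approximately", [round(v, 3) for v in x1])
```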


Harvesting Only the Youngest Age Class 


In some populations only the youngest females are of any economic value, so the harvester seeks to harvest
only the females from the youngest age class. Accordingly, let us set

$$h_1 = h$$
$$h_2 = h_3 = \cdots = h_n = 0$$
Equation 4 then reduces to
$$(1-h)(a_1 + a_2b_1 + a_3b_1b_2 + \cdots + a_nb_1b_2\cdots b_{n-1}) = 1$$
or
$$(1-h)R = 1$$

where R is the net reproduction rate of the population. [See Equation 17 of Section 10.17.] Solving for h, we
obtain

$$h = 1 - (1/R) \tag{9}$$

Note from this equation that a sustainable harvesting policy is possible only if $R > 1$. This is reasonable
because only if $R > 1$ is the population increasing. From Equation 5, the age distribution vector after each
harvest is proportional to the vector

$$\mathbf{x}_1 = \begin{bmatrix} 1\\ b_1\\ b_1b_2\\ b_1b_2b_3\\ \vdots\\ b_1b_2b_3\cdots b_{n-1} \end{bmatrix} \tag{10}$$
EXAMPLE 2 Sustainable Harvesting Policy
Let us apply this type of sustainable harvesting policy to the sheep population in Example 1.
For the net reproduction rate of the population we find
$$\begin{aligned} R &= a_1 + a_2b_1 + a_3b_1b_2 + \cdots + a_nb_1b_2\cdots b_{n-1}\\ &= (.000) + (.045)(.845) + \cdots + (.421)(.845)(.975)\cdots(.370)\\ &= 2.514 \end{aligned}$$
From Equation 9, the fraction of the first age class harvested is
$$h = 1 - (1/R) = 1 - (1/2.514) = .602$$
From Equation 10, the age distribution of the sheep population after the harvest is proportional
to the vector
$$\mathbf{x}_1 = \begin{bmatrix} 1.000\\ .845\\ (.845)(.975)\\ (.845)(.975)(.965)\\ \vdots\\ (.845)(.975)\cdots(.370) \end{bmatrix} = \begin{bmatrix} 1.000\\ 0.845\\ 0.824\\ 0.795\\ 0.755\\ 0.699\\ 0.626\\ 0.532\\ 0.418\\ 0.289\\ 0.162\\ 0.060 \end{bmatrix} \tag{11}$$

A direct calculation gives us the following (see also Exercise 3):

$$L\mathbf{x}_1 = \begin{bmatrix} 2.514\\ 0.845\\ 0.824\\ 0.795\\ 0.755\\ 0.699\\ 0.626\\ 0.532\\ 0.418\\ 0.289\\ 0.162\\ 0.060 \end{bmatrix} \tag{12}$$

The vector $L\mathbf{x}_1$ is the age distribution vector immediately before the harvest. The total of all
entries in $L\mathbf{x}_1$ is 8.520, so the first entry 2.514 is 29.5% of the total. This means that
immediately before each harvest, 29.5% of the population is in the youngest age class. Since
60.2% of this class is harvested, it follows that 17.8% (= 60.2% of 29.5%) of the entire sheep
population is harvested each year. This can be compared with the uniform harvesting policy of
Example 1, in which 15.0% of the sheep population is harvested each year.
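
The arithmetic of Example 2 can be retraced directly from the birth and survival parameters. The following sketch, with hypothetical variable names, computes R from Equation 17 of Section 10.17, the harvesting fraction from Equation 9, the vectors of Equations 10 and 12, and the fraction of the whole population that is harvested. The printed values should agree with the example up to rounding.

```python
# First row (a_i) and subdiagonal (b_i) of the sheep Leslie matrix of Example 1.
a = [0.000, 0.045, 0.391, 0.472, 0.484, 0.546,
     0.543, 0.502, 0.468, 0.459, 0.433, 0.421]
b = [0.845, 0.975, 0.965, 0.950, 0.926, 0.895,
     0.850, 0.786, 0.691, 0.561, 0.370]

# Equation 10: x1 = [1, b1, b1*b2, ..., b1*b2*...*b(n-1)].
x1 = [1.0]
for b_i in b:
    x1.append(x1[-1] * b_i)

# Net reproduction rate and the harvesting fraction of Equation 9.
R = sum(a_i * x_i for a_i, x_i in zip(a, x1))
h = 1.0 - 1.0 / R

# Equation 12: L x1 has first entry R; entry k (k >= 2) is b_(k-1) * x_(k-1),
# which equals the kth entry of x1 itself.
before_harvest = [R] + x1[1:]

harvested = h * before_harvest[0]            # only the youngest class is harvested
yield_fraction = harvested / sum(before_harvest)

print("R is approximately", round(R, 3))
print("h is approximately", round(h, 3))
print("total before harvest is approximately", round(sum(before_harvest), 3))
print("yield is approximately", round(100 * yield_fraction, 1), "% of the population")
```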


Optimal Sustainable Yield 


We saw in Example 1 that a sustainable harvesting policy in which the same fraction of each age class is
harvested produces a yield of 15.0% of the sheep population. In Example 2 we saw that if only the youngest
age class is harvested, the resulting yield is 17.8% of the population. There are many other possible
sustainable harvesting policies, and each generally provides a different yield. It would be of interest to find a
sustainable harvesting policy that produces the largest possible yield. Such a policy is called an optimal
sustainable harvesting policy, and the resulting yield is called the optimal sustainable yield. However,
determining the optimal sustainable yield requires linear programming theory, which we will not discuss here.
We refer you to the following result, which appears in J. R. Beddington and D. B. Taylor, "Optimum Age
Specific Harvesting of a Population," Biometrics, 29, 1973, pp. 801-809.


THEOREM 10.18.1 Optimal Sustainable Yield 


An optimal sustainable harvesting policy is one in which either one or two age classes are harvested. 
If two age classes are harvested, then the older age class is completely harvested. 


As an illustration, it can be shown that the optimal sustainable yield of the sheep population is attained when

$$h_1 = 0.522, \qquad h_9 = 1.000 \tag{13}$$

and all other values of $h_i$ are zero. Thus, 52.2% of the sheep between 0 and 1 year of age and all the sheep
between 8 and 9 years of age are harvested. As we ask you to show in Exercise 2, the resulting optimal
sustainable yield is 19.9% of the population.
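
Theorem 10.18.1 implies that only policies harvesting one age class, or two age classes with the older one completely harvested, need to be examined. The sketch below is one way such a search might be organized; it is an illustration under stated assumptions, not the linear programming method of the cited paper. For each candidate it solves Equation 4 for the unknown fraction, rejects values outside [0, 1], builds the post-harvest vector from Equation 5, and measures the yield as the harvested fraction of the population immediately before the harvest. All function names are hypothetical.

```python
a = [0.000, 0.045, 0.391, 0.472, 0.484, 0.546,
     0.543, 0.502, 0.468, 0.459, 0.433, 0.421]
b = [0.845, 0.975, 0.965, 0.950, 0.926, 0.895,
     0.850, 0.786, 0.691, 0.561, 0.370]

def sustainable_yield(a, b, h):
    """Harvested fraction of the pre-harvest population for fractions h satisfying Equation 4."""
    x = [1.0]                                   # Equation 5, normalized so x[0] = 1
    for i in range(1, len(a)):
        x.append(x[-1] * b[i - 1] * (1.0 - h[i]))
    before = [sum(a_i * x_i for a_i, x_i in zip(a, x))]   # first entry of L x
    before += [b[i] * x[i] for i in range(len(b))]        # remaining entries of L x
    harvested = sum(h_i * v for h_i, v in zip(h, before))
    return harvested / sum(before)

def best_policy(a, b):
    """Search one-class and two-class policies; return (yield, i, j, h_i) with 1-based classes."""
    n = len(a)
    terms, s = [], 1.0
    for k in range(n):
        if k > 0:
            s *= b[k - 1]
        terms.append(a[k] * s)                  # a_k * b_1 * ... * b_(k-1)
    best = None
    for i in range(n):
        for j in list(range(i + 1, n)) + [None]:    # None means no second class
            stop = n if j is None else j
            denom = sum(terms[i:stop])
            if denom == 0.0:
                continue
            h_i = (sum(terms[:stop]) - 1.0) / denom  # Equation 4 solved for h_i
            if not 0.0 <= h_i <= 1.0:
                continue                              # not a valid harvesting fraction
            h = [0.0] * n
            h[i] = h_i
            if j is not None:
                h[j] = 1.0                            # older class completely harvested
            y = sustainable_yield(a, b, h)
            if best is None or y > best[0]:
                best = (y, i + 1, None if j is None else j + 1, h_i)
    return best

best_yield, class_i, class_j, h_i = best_policy(a, b)
print("classes harvested:", class_i, class_j)
print("fraction of the younger class:", round(h_i, 3))
print("yield:", round(100 * best_yield, 1), "% of the population")
```

For the sheep data this search should single out the policy of Equations 13, since at most 78 candidates have to be compared.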


Exercise Set 10.18 


1. Let a certain animal population be divided into three 1-year age classes and have as its Leslie matrix

$$L = \begin{bmatrix} 0 & 4 & 3\\ \tfrac{1}{2} & 0 & 0\\ 0 & \tfrac{1}{4} & 0 \end{bmatrix}$$

(a) Find the yield and the age distribution vector after each harvest if the same fraction of each of the
three age classes is harvested every year.


(b) Find the yield and the age distribution vector after each harvest if only the youngest age class is
harvested every year. Also, find the fraction of the youngest age class that is harvested.


Answer:
(a) Yield = $33\tfrac{1}{3}\%$ of population; $\mathbf{x}_1 = \begin{bmatrix} 1\\ \tfrac{1}{3}\\ \tfrac{1}{18} \end{bmatrix}$
(b) Yield = 45.8% of population; $\mathbf{x}_1 = \begin{bmatrix} 1\\ \tfrac{1}{2}\\ \tfrac{1}{8} \end{bmatrix}$; harvest 57.9% of youngest age class


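2. For the optimal sustainable harvesting policy described by Equations 13, find the vector $\mathbf{x}_1$ that specifies
the age distribution of the population after each harvest. Also calculate the vector $L\mathbf{x}_1$ and verify that the
optimal sustainable yield is 19.9% of the population.


Answer:

$$\mathbf{x}_1 = \begin{bmatrix} 1.000\\ .845\\ .824\\ .795\\ .755\\ .699\\ .626\\ .532\\ 0\\ 0\\ 0\\ 0 \end{bmatrix}, \qquad L\mathbf{x}_1 = \begin{bmatrix} 2.090\\ .845\\ .824\\ .795\\ .755\\ .699\\ .626\\ .532\\ .418\\ 0\\ 0\\ 0 \end{bmatrix}, \qquad \frac{1.090 + .418}{7.584} = .199$$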


3. Use Equation 10 to show that if only the first age class of an animal population is harvested, then
$$L\mathbf{x}_1 - \mathbf{x}_1 = \begin{bmatrix} R - 1\\ 0\\ 0\\ \vdots\\ 0 \end{bmatrix}$$
where R is the net reproduction rate of the population.
4. If only the Ith class of an animal population is to be periodically harvested ($I = 1, 2, \ldots, n$), find the
corresponding harvesting fraction $h_I$.


Answer:

$$h_I = \frac{R - 1}{a_Ib_1b_2\cdots b_{I-1} + \cdots + a_nb_1b_2\cdots b_{n-1}}$$


5. Suppose that all of the Jth class and a certain fraction $h_I$ of the Ith class of an animal population is to be
periodically harvested ($1 \le I < J \le n$). Calculate $h_I$.


Answer:
$$h_I = \frac{a_1 + a_2b_1 + \cdots + (a_{J-1}b_1b_2\cdots b_{J-2}) - 1}{a_Ib_1b_2\cdots b_{I-1} + \cdots + a_{J-1}b_1b_2\cdots b_{J-2}}$$

Section 10.18 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the 
relevant documentation for the particular utility you are using. The goal of these exercises is to provide you 
with a basic proficiency with your technology utility. Once you have mastered the techniques in these 
exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


T1. The results of Theorem 10.18.1 suggest the following algorithm for determining the optimal sustainable 
yield. 


1. For each value of $i = 1, 2, \ldots, n$, set $h_i = h$ and $h_k = 0$ for $k \neq i$ and calculate the respective yields. These
n calculations give the one-age-class results. Of course, any calculation leading to a value of h not between
0 and 1 is rejected.


2. For each value of $i = 1, 2, \ldots, n - 1$ and $j = i + 1, i + 2, \ldots, n$, set $h_i = h$, $h_j = 1$, and $h_k = 0$ for $k \neq i, j$
and calculate the respective yields. These $\tfrac{1}{2}n(n - 1)$ calculations give the two-age-class results. Of
course, any calculation leading to a value of h not between 0 and 1 is again rejected.

3. Of the yields calculated in steps 1 and 2, the largest is the optimal sustainable yield. Note that there will
be at most

$$n + \tfrac{1}{2}n(n - 1) = \tfrac{1}{2}n(n + 1)$$

calculations in all. Once again, some of these may lead to a value of h not between 0 and 1 and must
therefore be rejected.


If we use this algorithm for the sheep example in the text, there will be at most $\tfrac{1}{2}(12)(12 + 1) = 78$
calculations to consider. Use a computer to do the two-age-class calculations for $h_1 = h$, $h_j = 1$, and $h_k = 0$
for $k \neq 1$ or j for $j = 2, 3, \ldots, 12$. Construct a summary table consisting of the values of $h_1$ and the
percentage yields using $j = 2, 3, \ldots, 12$, which will show that the largest of these yields occurs when $j = 9$.


T2. Using the algorithm in Exercise T1, do the one-age-class calculations for $h_i = h$ and $h_k = 0$ for $k \neq i$ for
$i = 1, 2, \ldots, 12$. Construct a summary table consisting of the values of $h_i$ and the percentage yields using
$i = 1, 2, \ldots, 12$, which will show that the largest of these yields occurs when $i = 9$.

T3. Referring to the mouse population in Exercise T3 of Section 10.17, suppose that reducing the birthrates 
is not practical, so you instead decide to control the population by uniformly harvesting all of the age classes 
monthly. 


(a) What fraction of the population must be harvested monthly to bring the mouse population to equilibrium 
eventually? 


(b) What is the equilibrium age distribution vector under this uniform harvesting policy? 


(c) The total number of mice in the original mouse population was 155. What would be the total number of 
mice after 5, 10, and 200 months under your uniform harvesting policy? 
10.19 A Least Squares Model for Human Hearing


In this section we apply the method of least squares approximation to a model for human hearing. The use of this 
method is motivated by energy considerations. 


Prerequisites 


Inner Product Spaces 
Orthogonal Projection 


Fourier Series (Section 6.6) 


Anatomy of the Ear 


We begin with a brief discussion of the nature of sound and human hearing. Figure 10.19.1 is a schematic diagram 
of the ear showing its three main components: the outer ear, middle ear, and inner ear. Sound waves enter the outer 
ear where they are channeled to the eardrum, causing it to vibrate. Three tiny bones in the middle ear mechanically 
link the eardrum with the snail-shaped cochlea within the inner ear. These bones pass on the vibrations of the 
eardrum to a fluid within the cochlea. The cochlea contains thousands of minute hairs that oscillate with the fluid. 
Those near the entrance of the cochlea are stimulated by high frequencies, and those near the tip are stimulated by 
low frequencies. The movements of these hairs activate nerve cells that send signals along various neural pathways 
to the brain, where the signals are interpreted as sound. 


Figure 10.19.1 [Schematic diagram of the ear; legible labels: cochlea, auditory nerve.]


The sound waves themselves are variations in time of the air pressure. For the auditory system, the most
elementary type of sound wave is a sinusoidal variation in the air pressure. This type of sound wave stimulates the
hairs within the cochlea in such a way that a nerve impulse along a single neural pathway is produced (Figure
10.19.2). A sinusoidal sound wave can be described by a function of time

$$q(t) = A_0 + A\sin(\omega t - \delta) \tag{1}$$

where $q(t)$ is the atmospheric pressure at the eardrum, $A_0$ is the normal atmospheric pressure, A is the maximum
deviation of the pressure from the normal atmospheric pressure, $\omega/2\pi$ is the frequency of the wave in cycles per
second, and $\delta$ is the phase angle of the wave. To be perceived as sound, such sinusoidal waves must have
frequencies within a certain range. For humans this range is roughly 20 cycles per second (cps) to 20,000 cps.
Frequencies outside this range will not stimulate the hairs within the cochlea enough to produce nerve signals.


Figure 10.19.2


To a reasonable degree of accuracy, the ear is a linear system. This means that if a complex sound wave is a finite
sum of sinusoidal components of different amplitudes, frequencies, and phase angles, say,

$$q(t) = A_0 + A_1\sin(\omega_1t - \delta_1) + A_2\sin(\omega_2t - \delta_2) + \cdots + A_n\sin(\omega_nt - \delta_n) \tag{2}$$

then the response of the ear consists of nerve impulses along the same neural pathways that would be stimulated by
the individual components (Figure 10.19.3).


Figure 10.19.3


Let us now consider some periodic sound wave $p(t)$ with period T [i.e., $p(t) = p(t + T)$] that is not a finite sum
of sinusoidal waves. If we examine the response of the ear to such a periodic wave, we find that it is the same as
the response to some wave that is the sum of sinusoidal waves. That is, there is some sound wave $q(t)$ as given by
Equation 2 that produces the same response as $p(t)$, even though $p(t)$ and $q(t)$ are different functions of time.


We now want to determine the frequencies, amplitudes, and phase angles of the sinusoidal components of $q(t)$.
Because $q(t)$ produces the same response as the periodic wave $p(t)$, it is reasonable to expect that $q(t)$ has the
same period T as $p(t)$. This requires that each sinusoidal term in $q(t)$ have period T. Consequently, the frequencies
of the sinusoidal components must be integer multiples of the basic frequency $1/T$ of the function $p(t)$. Thus, the
$\omega_k$ in Equation 2 must be of the form

$$\omega_k = 2k\pi/T, \qquad k = 1, 2, \ldots$$

But because the ear cannot perceive sinusoidal waves with frequencies greater than 20,000 cps, we may omit those
values of k for which $\omega_k/2\pi = k/T$ is greater than 20,000. Thus, $q(t)$ is of the form

$$q(t) = A_0 + A_1\sin\left(\frac{2\pi t}{T} - \delta_1\right) + \cdots + A_n\sin\left(\frac{2n\pi t}{T} - \delta_n\right) \tag{3}$$
where n is the largest integer such that $n/T$ is not greater than 20,000.
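
The cutoff on n can be illustrated with a short calculation. The sketch below assumes, for simplicity, that the basic frequency $1/T$ is a whole number of cycles per second; the function name and the 20,000 cps default are taken from the discussion above and are otherwise hypothetical.

```python
def audible_harmonics(basic_frequency_cps, upper_limit_cps=20000):
    """Return the frequencies k/T (k = 1, ..., n) that do not exceed the upper limit."""
    n = upper_limit_cps // basic_frequency_cps   # largest n with n/T <= upper limit
    return [k * basic_frequency_cps for k in range(1, n + 1)]

# A wave with basic frequency 5000 cps keeps the terms k = 1, 2, 3, 4 in Equation 3.
print(audible_harmonics(5000))

# A wave with basic frequency 440 cps keeps many more terms.
print(len(audible_harmonics(440)))
```

A wave with a higher basic frequency therefore needs fewer sinusoidal terms in Equation 3.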


We now turn our attention to the values of the amplitudes $A_0, A_1, \ldots, A_n$ and the phase angles $\delta_1, \delta_2, \ldots, \delta_n$ that
appear in Equation 3. There is some criterion by which the auditory system "picks" these values so that $q(t)$
produces the same response as $p(t)$. To examine this criterion, let us set

$$e(t) = p(t) - q(t)$$
If we consider $q(t)$ as an approximation to $p(t)$, then $e(t)$ is the error in this approximation, an error that the ear
cannot perceive. In terms of $e(t)$, the criterion for the determination of the amplitudes and the phase angles is that
the quantity

$$\int_0^T [e(t)]^2\,dt = \int_0^T [p(t) - q(t)]^2\,dt \tag{4}$$

be as small as possible. We cannot go into the physiological reasons for this, but we note that this expression is
proportional to the acoustic energy of the error wave $e(t)$ over one period. In other words, it is the energy of the
difference between the two sound waves $p(t)$ and $q(t)$ that determines whether the ear perceives any difference
between them. If this energy is as small as possible, then the two waves produce the same sensation of sound.
Mathematically, the function $q(t)$ in 4 is the least squares approximation to $p(t)$ from the vector space $C[0, T]$ of
continuous functions on the interval $[0, T]$. (See Section 6.6.)


Least squares approximations by continuous functions arise in a wide variety of engineering and scientific 
approximation problems. Apart from the acoustics problem just discussed, some other examples follow. 


1. Let S(x) be the axial strain distribution in a uniform rod lying along the x-axis from x = 0 to x =} (Figure 
10.19.4). The strain energy in the rod is proportional to the integral 


i 
/ [S(x]? dx 
0 


The closeness of an approximation g (x) to S(x) can be judged according to the strain energy of the difference 
of the two strain distributions. That energy is proportional to 


j 2 
/ [S(x) —9 (x) ]? dx 


which is a least squares criterion. 


2. Let E(£) be a periodic voltage across a resistor in an electrical circuit (Figure 10.19.5). The electrical energy 
transferred to the resistor during one period T is proportional to 


y 
i [EE]? а 
0 


If 4 (2) has the same period as E(£) and is to be an approximation to E(£), then the criterion of closeness might 
be taken as the energy of the difference voltage. This is proportional to 


P 2 
/ 200) —а(01]2 at 


which is again a least squares criterion. 


3. Let $y(x)$ be the vertical displacement of a uniform flexible string whose equilibrium position is along the x-axis 
from $x = 0$ to $x = l$ (Figure 10.19.6). The elastic potential energy of the string is proportional to 

$$\int_0^l [y(x)]^2\,dx$$

If $q(x)$ is to be an approximation to the displacement, then as before, the energy integral 

$$\int_0^l [y(x) - q(x)]^2\,dx$$

determines a least squares criterion for the closeness of the approximation. 
Figure 10.19.5 
Figure 10.19.6 


Least squares approximation is also used in situations where there is no a priori justification for its use, such as for 
approximating business cycles, population growth curves, sales curves, and so forth. It is used in these cases 
because of its mathematical simplicity. In general, if no other error criterion is immediately apparent for an 
approximation problem, the least squares criterion is the one most often chosen. 


The following result was obtained in Section 6.6. 


THEOREM 10.19.1 Minimizing Mean Square Error on $[0, 2\pi]$ 
If $f(t)$ is continuous on $[0, 2\pi]$, then the trigonometric function $g(t)$ of the form 

$$g(t) = \tfrac{1}{2}a_0 + a_1\cos t + \cdots + a_n\cos nt + b_1\sin t + \cdots + b_n\sin nt$$

that minimizes the mean square error 

$$\int_0^{2\pi} [f(t) - g(t)]^2\,dt$$

has coefficients 

$$a_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\cos kt\,dt, \quad k = 0, 1, 2, \ldots, n$$

$$b_k = \frac{1}{\pi}\int_0^{2\pi} f(t)\sin kt\,dt, \quad k = 1, 2, \ldots, n$$


If the original function $f(t)$ is defined over the interval $[0, T]$ instead of $[0, 2\pi]$, a change of scale will yield the 
following result (see Exercise 8): 


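THEOREM 10.19.2 Minimizing Mean Square Error on $[0, T]$ 

If $f(t)$ is continuous on $[0, T]$, then the trigonometric function $g(t)$ of the form 

$$g(t) = \tfrac{1}{2}a_0 + a_1\cos\frac{2\pi t}{T} + \cdots + a_n\cos\frac{2n\pi t}{T} + b_1\sin\frac{2\pi t}{T} + \cdots + b_n\sin\frac{2n\pi t}{T}$$

that minimizes the mean square error 

$$\int_0^T [f(t) - g(t)]^2\,dt$$

has coefficients 

$$a_k = \frac{2}{T}\int_0^T f(t)\cos\frac{2k\pi t}{T}\,dt, \quad k = 0, 1, 2, \ldots, n$$

$$b_k = \frac{2}{T}\int_0^T f(t)\sin\frac{2k\pi t}{T}\,dt, \quad k = 1, 2, \ldots, n$$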


EXAMPLE 1 Least Squares Approximation to a Sound Wave 

Let a sound wave $p(t)$ have a saw-tooth pattern with a basic frequency of 5000 cps (Figure 10.19.7). 
Assume units are chosen so that the normal atmospheric pressure is at the zero level and the 
maximum amplitude of the wave is $A$. The basic period of the wave is $T = 1/5000 = 0.0002$ second. 
From $t = 0$ to $t = T$, the function $p(t)$ has the equation 

$$p(t) = \frac{2A}{T}\left(\frac{T}{2} - t\right)$$

Theorem 10.19.2 then yields the following (verify): 

$$a_0 = \frac{2}{T}\int_0^T p(t)\,dt = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)dt = 0$$

$$a_k = \frac{2}{T}\int_0^T p(t)\cos\frac{2k\pi t}{T}\,dt = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)\cos\frac{2k\pi t}{T}\,dt = 0, \quad k = 1, 2, \ldots$$

$$b_k = \frac{2}{T}\int_0^T p(t)\sin\frac{2k\pi t}{T}\,dt = \frac{2}{T}\int_0^T \frac{2A}{T}\left(\frac{T}{2} - t\right)\sin\frac{2k\pi t}{T}\,dt = \frac{2A}{k\pi}, \quad k = 1, 2, \ldots$$

We can now investigate how the sound wave $p(t)$ is perceived by the human ear. We note that 
$4/T = 20{,}000$ cps, so we need only go up to $k = 4$ in the formulas above. The least squares 
approximation to $p(t)$ is then 

$$q(t) = \frac{2A}{\pi}\left[\sin\frac{2\pi t}{T} + \frac{1}{2}\sin\frac{4\pi t}{T} + \frac{1}{3}\sin\frac{6\pi t}{T} + \frac{1}{4}\sin\frac{8\pi t}{T}\right]$$

The four sinusoidal terms have frequencies of 5000, 10,000, 15,000, and 20,000 cps, respectively. In 
Figure 10.19.8 we have plotted $p(t)$ and $q(t)$ over one period. Although $q(t)$ is not a very good 
point-by-point approximation to $p(t)$, to the ear, both $p(t)$ and $q(t)$ produce the same sensation of 
sound. 


Figure 10.19.7 
Figure 10.19.8 
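
The coefficients in Example 1 can be checked numerically. The following is a minimal sketch, not part of the text: it approximates the integrals of Theorem 10.19.2 with the midpoint rule and compares the results with $b_k = 2A/(k\pi)$. The function name `fourier_coefficients`, the sample count, and the choice $A = 1$ are assumptions made only for illustration.

```python
import math

def fourier_coefficients(f, T, n, samples=20000):
    """Midpoint-rule estimates of a_0..a_n and b_1..b_n from Theorem 10.19.2."""
    dt = T / samples
    ts = [(j + 0.5) * dt for j in range(samples)]
    a, b = [], []
    for k in range(n + 1):
        w = 2 * k * math.pi / T
        a.append(2 / T * dt * sum(f(t) * math.cos(w * t) for t in ts))
        if k >= 1:
            b.append(2 / T * dt * sum(f(t) * math.sin(w * t) for t in ts))
    return a, b

A = 1.0        # amplitude; a unit value is assumed for illustration
T = 1 / 5000   # basic period in seconds, as in Example 1

def p(t):
    """Saw-tooth wave of Example 1 on 0 <= t <= T."""
    return (2 * A / T) * (T / 2 - t)

a, b = fourier_coefficients(p, T, 4)
for k, bk in enumerate(b, start=1):
    # numerical estimate next to the closed form 2A/(k*pi)
    print(k, bk, 2 * A / (k * math.pi))
```

The cosine coefficients come out as zero up to rounding, and the sine coefficients agree with the closed form to within the accuracy of the quadrature.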


As discussed in Section 6.6, the least squares approximation becomes better as the number of terms in the 
approximating trigonometric polynomial becomes larger. More precisely, 


$$\int_0^{2\pi} \left[f(t) - \tfrac{1}{2}a_0 - \sum_{k=1}^{n} (a_k\cos kt + b_k\sin kt)\right]^2 dt$$

tends to zero as $n$ approaches infinity. We denote this by writing 

$$f(t) = \tfrac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k\cos kt + b_k\sin kt)$$

where the right side of this equation is the Fourier series of $f(t)$. Whether the Fourier series of $f(t)$ converges to 
$f(t)$ for each $t$ is another question, and a more difficult one. For most continuous functions encountered in 
applications, the Fourier series does indeed converge to its corresponding function for each value of $t$. 
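
To see the mean square error shrink as $n$ grows, the sketch below uses the saw-tooth $f(t) = \pi - t$ on $[0, 2\pi]$, which is the wave of Example 1 with the assumed values $A = \pi$ and $T = 2\pi$, so that $a_k = 0$ and $b_k = 2/k$. The function name `mean_square_error` and the sample count are illustrative choices, and the integral is only approximated by the midpoint rule.

```python
import math

def mean_square_error(n, samples=4000):
    """Midpoint-rule estimate of the error integral for f(t) = pi - t on [0, 2*pi]."""
    dt = 2 * math.pi / samples
    total = 0.0
    for j in range(samples):
        t = (j + 0.5) * dt
        # partial sum of the Fourier series, using b_k = 2/k and a_k = 0
        partial = sum((2 / k) * math.sin(k * t) for k in range(1, n + 1))
        total += (math.pi - t - partial) ** 2 * dt
    return total

orders = (1, 2, 4, 8, 16)
errors = [mean_square_error(n) for n in orders]
for n, err in zip(orders, errors):
    print(n, err)
```

Each doubling of the order roughly halves the error in this example, which is consistent with the error tending to zero.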


Exercise Set 10.19 


1. Find the trigonometric polynomial of order 3 that is the least squares approximation to the function 
$f(t) = (t - \pi)^2$ over the interval $[0, 2\pi]$. 

Answer: 

$$\frac{\pi^2}{3} + 4\cos t + \cos 2t + \frac{4}{9}\cos 3t$$

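2. Find the trigonometric polynomial of order 4 that is the least squares approximation to the function $f(t) = t^2$ 
over the interval $[0, T]$. 

Answer: 

$$\frac{T^2}{3} + \frac{T^2}{\pi^2}\left(\cos\frac{2\pi t}{T} + \frac{1}{2^2}\cos\frac{4\pi t}{T} + \frac{1}{3^2}\cos\frac{6\pi t}{T} + \frac{1}{4^2}\cos\frac{8\pi t}{T}\right) - \frac{T^2}{\pi}\left(\sin\frac{2\pi t}{T} + \frac{1}{2}\sin\frac{4\pi t}{T} + \frac{1}{3}\sin\frac{6\pi t}{T} + \frac{1}{4}\sin\frac{8\pi t}{T}\right)$$

3. Find the trigonometric polynomial of order 4 that is the least squares approximation to the function $f(t)$ over 
the interval $[0, 2\pi]$, where 

$$f(t) = \begin{cases} \sin t, & 0 \le t \le \pi \\ 0, & \pi < t \le 2\pi \end{cases}$$

Answer: 

$$\frac{1}{\pi} + \frac{1}{2}\sin t - \frac{2}{3\pi}\cos 2t - \frac{2}{15\pi}\cos 4t$$
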
4. Find the trigonometric polynomial of arbitrary order $n$ that is the least squares approximation to the function 
$f(t) = \sin\tfrac{1}{2}t$ over the interval $[0, 2\pi]$. 

Answer: 

$$\frac{4}{\pi}\left(\frac{1}{2} - \frac{1}{1\cdot 3}\cos t - \frac{1}{3\cdot 5}\cos 2t - \frac{1}{5\cdot 7}\cos 3t - \cdots - \frac{1}{(2n-1)(2n+1)}\cos nt\right)$$


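5. Find the trigonometric polynomial of arbitrary order $n$ that is the least squares approximation to the function 
$f(t)$ over the interval $[0, T]$, where 

$$f(t) = \begin{cases} t, & 0 \le t \le \tfrac{1}{2}T \\ T - t, & \tfrac{1}{2}T < t \le T \end{cases}$$

Answer: 

$$\frac{T}{4} - \frac{8T}{\pi^2}\left(\frac{1}{2^2}\cos\frac{2\pi t}{T} + \frac{1}{6^2}\cos\frac{6\pi t}{T} + \frac{1}{10^2}\cos\frac{10\pi t}{T} + \cdots + \frac{1}{(2n)^2}\cos\frac{2n\pi t}{T}\right)$$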


6. For the inner product 

$$\langle u, v\rangle = \int_0^{2\pi} u(t)v(t)\,dt$$

show that 
(a) $\|1\| = \sqrt{2\pi}$ 
(b) $\|\cos kt\| = \sqrt{\pi}$ for $k = 1, 2, \ldots$ 
(c) $\|\sin kt\| = \sqrt{\pi}$ for $k = 1, 2, \ldots$ 

7. Show that the $2n + 1$ functions 
$1, \cos t, \cos 2t, \ldots, \cos nt, \sin t, \sin 2t, \ldots, \sin nt$ 
are orthogonal over the interval $[0, 2\pi]$ relative to the inner product $\langle u, v\rangle$ defined in Exercise 6. 

8. If $f(t)$ is defined and continuous on the interval $[0, T]$, show that $f(T\tau/2\pi)$ is defined and continuous for $\tau$ 
in the interval $[0, 2\pi]$. Use this fact to show how Theorem 10.19.2 follows from Theorem 10.19.1. 


Section 10.19 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be MATLAB, 
Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra software or a 
scientific calculator with some linear algebra capabilities. For each exercise you will need to read the relevant 
documentation for the particular utility you are using. The goal of these exercises is to provide you with a basic 
proficiency with your technology utility. Once you have mastered the techniques in these exercises, you will be 
able to use your technology utility to solve many of the problems in the regular exercise sets. 


T1. Let $g$ be the function 

$$g(t) = \frac{3 + 4\sin t}{5 - 4\cos t}$$

for $0 \le t \le 2\pi$. Use a computer to determine the Fourier coefficients 

$$a_k = \frac{1}{\pi}\int_0^{2\pi} \left(\frac{3 + 4\sin t}{5 - 4\cos t}\right)\cos kt\,dt, \qquad b_k = \frac{1}{\pi}\int_0^{2\pi} \left(\frac{3 + 4\sin t}{5 - 4\cos t}\right)\sin kt\,dt$$

for $k = 0, 1, 2, 3, 4, 5$. From your results, make a conjecture about the general expressions for $a_k$ and $b_k$. Test your 
conjecture by calculating 

$$\tfrac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k\cos kt + b_k\sin kt)$$

on the computer and see whether it converges to $g(t)$. 

T2. Let $g$ be the function 

$$g(t) = e^{\cos t}[\cos(\sin t) + \sin(\sin t)]$$

for $0 \le t \le 2\pi$. Use a computer to determine the Fourier coefficients 

$$a_k = \frac{1}{\pi}\int_0^{2\pi} g(t)\cos kt\,dt, \qquad b_k = \frac{1}{\pi}\int_0^{2\pi} g(t)\sin kt\,dt$$

for $k = 0, 1, 2, 3, 4, 5$. From your results, make a conjecture about the general expressions for $a_k$ and $b_k$. Test your 
conjecture by calculating 

$$\tfrac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k\cos kt + b_k\sin kt)$$

on the computer and see whether it converges to $g(t)$. 
10.20 Warps and Morphs 


Among the more interesting image-manipulation techniques available for computer graphics are warps and 
morphs. In this section we show how linear transformations can be used to distort a single picture to produce 
a warp, or to distort and blend two pictures to produce a morph. 


Prerequisites 


Geometry of Linear Operators on $R^2$ (Section 4.11) 


Linear Independence 


Bases in $R^2$ 


Computer graphics software enables you to manipulate an image in various ways, such as by scaling, rotating, 
or slanting the image. Distorting an image by separately moving the corners of a rectangle containing the 
image is another basic image-manipulation technique. Distorting various pieces of an image in different ways 
is a more complicated procedure that results in a warp of the picture. In addition, warping two different 
images in complementary ways and blending the warps results in a morph of the two pictures (from the Greek 
root meaning “shape” or “form”). An example is Figure 10.20.1 in which four photographs of a woman taken 
over a 50-year period (the four diagonal pictures from top left to bottom right) have been pairwise morphed 
by different amounts to suggest the gradual aging of the woman. 


Figure 10.20.1 


The most visible application of warping and morphing images has been the production of special effects in 
motion pictures and television. However, many scientific and technological applications of such techniques 
have also arisen—for example, studying the evolution, growth, and development of living organisms, 
assisting in reconstructive and cosmetic surgery, exploring various designs of a product, and “aging” 
photographs of missing persons or police suspects. 


Warps 
We begin by describing a simple warp of a triangular region in the plane. Let the three vertices of a triangle be 
given by the three noncollinear points $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ (Figure 10.20.2a). We will call this triangle the 
begin-triangle. If $\mathbf{v}$ is any point in the begin-triangle, then there are unique constants $c_1$ and $c_2$ such that 

$$\mathbf{v} - \mathbf{v}_3 = c_1(\mathbf{v}_1 - \mathbf{v}_3) + c_2(\mathbf{v}_2 - \mathbf{v}_3) \tag{1}$$

Equation 1 expresses the vector $\mathbf{v} - \mathbf{v}_3$ as a (unique) linear combination of the two linearly independent 
vectors $\mathbf{v}_1 - \mathbf{v}_3$ and $\mathbf{v}_2 - \mathbf{v}_3$ with respect to an origin at $\mathbf{v}_3$. If we set $c_3 = 1 - c_1 - c_2$, then we can rewrite 1 
as 

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 \tag{2}$$

where 

$$c_1 + c_2 + c_3 = 1 \tag{3}$$

from the definition of $c_3$. We say that $\mathbf{v}$ is a convex combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ if 2 and 3 are 
satisfied and, in addition, the coefficients $c_1$, $c_2$, and $c_3$ are nonnegative. It can be shown (Exercise 6) that $\mathbf{v}$ 
lies in the triangle determined by $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ if and only if it is a convex combination of those three 
vectors. 


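Figure 10.20.2 (a) Begin-triangle, with $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$; (b) end-triangle, with $\mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$ 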


Next, given three noncollinear points $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$ of an end-triangle (Figure 10.20.2b), there is a unique 
affine transformation that maps $\mathbf{v}_1$ to $\mathbf{w}_1$, $\mathbf{v}_2$ to $\mathbf{w}_2$, and $\mathbf{v}_3$ to $\mathbf{w}_3$. That is, there is a unique $2 \times 2$ invertible 
matrix $M$ and a unique vector $\mathbf{b}$ such that 

$$\mathbf{w}_i = M\mathbf{v}_i + \mathbf{b} \quad \text{for } i = 1, 2, 3 \tag{4}$$

(See Exercise 5 for the evaluation of $M$ and $\mathbf{b}$.) Moreover, it can be shown (Exercise 3) that the image $\mathbf{w}$ of the 
vector $\mathbf{v}$ in 2 under this affine transformation is 

$$\mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3 \tag{5}$$

This is a basic property of affine transformations: They map a convex combination of vectors to the same 
convex combination of the images of the vectors. 
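
As a concrete illustration of Equations 1 through 5, the sketch below solves Equation 1 for the coefficients by Cramer's rule and then forms the same convex combination of the end-triangle vertices. The function names `barycentric` and `combine` and the sample triangles are assumptions chosen for illustration; they do not come from the text.

```python
def barycentric(v, v1, v2, v3):
    """Solve Equation 1 for c1 and c2 by Cramer's rule, then set c3 = 1 - c1 - c2."""
    ax, ay = v1[0] - v3[0], v1[1] - v3[1]   # v1 - v3
    bx, by = v2[0] - v3[0], v2[1] - v3[1]   # v2 - v3
    px, py = v[0] - v3[0], v[1] - v3[1]     # v  - v3
    det = ax * by - bx * ay
    if det == 0:
        raise ValueError("the three vertices are collinear")
    c1 = (px * by - bx * py) / det
    c2 = (ax * py - px * ay) / det
    return c1, c2, 1 - c1 - c2

def combine(c, p1, p2, p3):
    """Form c1*p1 + c2*p2 + c3*p3 for points in the plane (Equations 2 and 5)."""
    return tuple(c[0] * p1[i] + c[1] * p2[i] + c[2] * p3[i] for i in range(2))

begin = [(0, 0), (4, 0), (0, 4)]   # hypothetical begin-triangle v1, v2, v3
end = [(1, 1), (5, 2), (2, 6)]     # hypothetical end-triangle w1, w2, w3

c = barycentric((1, 1), *begin)
inside = all(ci >= 0 for ci in c)   # convex combination <=> point lies in the triangle
w = combine(c, *end)                # image of v under the affine map, by Equation 5
print(c, inside, w)
```

Note that the image is found without ever computing $M$ and $\mathbf{b}$ explicitly; the coefficients carry all the information needed.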


Now suppose that the begin-triangle contains a picture within it (Figure 10.20.3a). That is, to each point in the 
begin-triangle we assign a gray level, say 0 for white and 100 for black, with any other gray level lying 
between 0 and 100. In particular, let a scalar-valued function $\rho_0$, called the picture-density of the 
begin-triangle, be defined so that $\rho_0(\mathbf{v})$ is the gray level at the point $\mathbf{v}$ in the begin-triangle. We can now define a 
picture in the end-triangle, called a warp of the original picture, with a picture-density $\rho_1$ by defining the gray 
level at the point $\mathbf{w}$ within the end-triangle to be the gray level of the point $\mathbf{v}$ in the begin-triangle that maps 
onto $\mathbf{w}$. In equation form, the picture-density $\rho_1$ is determined by 

$$\rho_1(\mathbf{w}) = \rho_0(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3) \tag{6}$$

In this way, as $c_1$, $c_2$, and $c_3$ vary over all nonnegative values that add to one, 5 generates all points $\mathbf{w}$ in the 
end-triangle, and 6 generates the gray levels $\rho_1(\mathbf{w})$ of the warped picture at those points (Figure 10.20.3b). 
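
In practice Equation 6 is often evaluated in the reverse direction: for a point $\mathbf{w}$ of the end-triangle, find its coefficients relative to $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$, and look up the gray level at the matching point of the begin-triangle. The sketch below assumes a made-up picture-density `rho0` (a horizontal gray ramp) and hypothetical triangles; the helper names are illustrative only.

```python
def barycentric(p, p1, p2, p3):
    """Coefficients c1, c2, c3 with p = c1*p1 + c2*p2 + c3*p3 and c1 + c2 + c3 = 1."""
    ax, ay = p1[0] - p3[0], p1[1] - p3[1]
    bx, by = p2[0] - p3[0], p2[1] - p3[1]
    px, py = p[0] - p3[0], p[1] - p3[1]
    det = ax * by - bx * ay
    c1 = (px * by - bx * py) / det
    c2 = (ax * py - px * ay) / det
    return c1, c2, 1 - c1 - c2

def warped_density(w, begin, end, rho0):
    """Gray level rho1(w) of the warp (Equation 6), or None if w is outside the end-triangle."""
    c = barycentric(w, *end)
    if min(c) < 0:
        return None
    v = tuple(sum(ci * vi[k] for ci, vi in zip(c, begin)) for k in range(2))
    return rho0(v)

def rho0(v):
    """Hypothetical begin-picture: gray level rises from 0 to 100 as x goes from 0 to 4."""
    return 25.0 * v[0]

begin = [(0, 0), (4, 0), (0, 4)]
end = [(1, 1), (5, 2), (2, 6)]
gray = warped_density((2.25, 2.5), begin, end, rho0)
outside = warped_density((10, 10), begin, end, rho0)
print(gray, outside)
```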


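Figure 10.20.3 (a) Begin-triangle, with $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$; (b) end-triangle, with $\mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$ and $\rho_1(\mathbf{w}) = \rho_0(\mathbf{v})$ 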


Equation 6 determines a very simple warp of a picture within a single triangle. More generally, we can break 
up a picture into many triangular regions and warp each triangular region differently. This gives us much 
freedom in designing a warp through our choice of triangular regions and how we change them. To this end, 
suppose we are given a picture contained within some rectangular region of the plane. We choose $n$ points $\mathbf{v}_1$, 
$\mathbf{v}_2$, ..., $\mathbf{v}_n$ within the rectangle, which we call vertex points, so that they fall on key elements or features of 
the picture we wish to warp (Figure 10.20.4a). Once the vertex points are chosen, we complete a 
triangulation of the rectangular region; that is, we draw line segments between the vertex points in such a 
way that we have the following conditions (Figure 10.20.4b): 


1. The line segments form the sides of a set of triangles. 

2. The line segments do not intersect. 

3. Each vertex point is the vertex of at least one triangle. 

4. The union of the triangles is the rectangle. 

5. The set of triangles is maximal (i.e., no more vertices can be connected). 


Note that condition 4 requires that each corner of the rectangle containing the picture be a vertex point. 


Figure 10.20.4 


One can always form a triangulation from any $n$ vertex points, but the triangulation is not necessarily unique. 
For example, Figures 10.20.4b and 10.20.4c are two different triangulations of the set of vertex points in 
Figure 10.20.4a. Since there are various computer algorithms that perform triangulations very quickly, it is 
not necessary to perform the tiresome triangulation task by hand; one need only specify the desired vertex 
points and let a computer generate a triangulation from them. If $n$ is the number of vertex points chosen, it can 
be shown that the number of triangles $m$ of any triangulation of those points is given by 

$$m = 2n - 2 - k \tag{7}$$

where $k$ is the number of vertex points lying on the boundary of the rectangle, including the four situated at 
the corner points. 
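
Equation 7 is simple enough to check by direct computation. The sketch below wraps it in a small function; the name `triangle_count` and the input checks are illustrative assumptions. The values $n = 7$, $k = 5$ are those reported for Figure 10.20.4 in the answer to Exercise 2 of this section.

```python
def triangle_count(n, k):
    """Number of triangles m = 2n - 2 - k in a triangulation of a rectangle (Equation 7).

    n is the number of vertex points and k is the number of those on the boundary,
    which always includes the four corners.
    """
    if k < 4 or k > n:
        raise ValueError("need 4 <= k <= n")
    return 2 * n - 2 - k

print(triangle_count(7, 5))   # seven vertex points, five on the boundary
print(triangle_count(4, 4))   # corners only: the rectangle splits into two triangles
print(triangle_count(5, 4))   # corners plus one interior point
```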


The warp is specified by moving the $n$ vertex points $\mathbf{v}_1$, $\mathbf{v}_2$, ..., $\mathbf{v}_n$ to new locations $\mathbf{w}_1$, $\mathbf{w}_2$, ..., $\mathbf{w}_n$ according 
to the changes we desire in the picture (Figures 10.20.5a and 10.20.5b). However, we impose two restrictions 
on the movements of the vertex points: 


1. The four vertex points at the corners of the rectangle are to remain fixed, and any vertex point on a side of 
the rectangle is to remain fixed or move to another point on the same side of the rectangle. All other vertex 
points are to remain in the interior of the rectangle. 


2. The triangles determined by the triangulation are not to overlap after their vertices have been moved. 


The first restriction guarantees that the rectangular shape of the begin-picture is preserved. The second 
restriction guarantees that the displaced vertex points still form a triangulation of the rectangle and that the 
new triangulation is similar to the original one. For example, Figure 10.20.5c is not an allowable movement 
of the vertex points shown in Figure 10.20.5a. Although a violation of this condition can be handled 
mathematically without too much additional effort, the resulting warps usually produce unnatural results and 
we will not consider them here. 


Figure 10.20.5 


Figure 10.20.6 is a warp of a photograph of a woman using a triangulation with 94 vertex points and 179 
triangles. Note that the vertex points in the begin-triangulation are chosen to lie along key features of the 
picture (hairline, eyes, lips, etc.). These vertex points were moved to final positions corresponding to those 
same features in a picture of the woman taken 20 years after the begin-picture. Thus, the warped picture 
represents the woman forced into her older shape but using her younger gray levels. 


Figure 10.20.6 (panels: Begin-picture, Begin-triangulation, Warped triangulation) 


Time-Varying Warps 


A time-varying warp is the set of warps generated when the vertex points of the begin-picture are moved 
continually in time from their original positions to specified final positions. This gives us a motion picture in 
which the begin-picture is continually warped to a final warp. Let us choose time units so that $t = 0$ 
corresponds to our begin-picture and $t = 1$ corresponds to our final warp. The simplest way of moving the 
vertex points from time 0 to time 1 is with constant velocity along straight-line paths from their initial 
positions to their final positions. 

To describe such a motion, let $\mathbf{u}_i(t)$ denote the position of the $i$th vertex point at any time $t$ between 0 and 1. 
Thus $\mathbf{u}_i(0) = \mathbf{v}_i$ (its given position in the begin-picture) and $\mathbf{u}_i(1) = \mathbf{w}_i$ (its given position in the final warp). 
In between, we determine its position by 

$$\mathbf{u}_i(t) = (1 - t)\mathbf{v}_i + t\mathbf{w}_i \tag{8}$$

Note that 8 expresses $\mathbf{u}_i(t)$ as a convex combination of $\mathbf{v}_i$ and $\mathbf{w}_i$ for each $t$ in $[0, 1]$. Figure 10.20.7 
illustrates a time-varying triangulation of a plain rectangular region with six vertex points. The lines 
connecting the vertex points at the different times are the space-time paths of these vertex points in this 
space-time diagram. 
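
Equation 8 translates directly into code. The following is a minimal sketch with made-up vertex lists; the function name `vertex_positions` is an assumption for illustration.

```python
def vertex_positions(begin, end, t):
    """Positions u_i(t) = (1 - t) v_i + t w_i of all vertex points at time t (Equation 8)."""
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1]")
    return [((1 - t) * vx + t * wx, (1 - t) * vy + t * wy)
            for (vx, vy), (wx, wy) in zip(begin, end)]

begin = [(0, 0), (4, 0), (2, 2)]   # hypothetical v_1, v_2, v_3
end = [(0, 0), (4, 0), (3, 1)]     # hypothetical w_1, w_2, w_3

for t in (0.0, 0.25, 0.5, 1.0):
    print(t, vertex_positions(begin, end, t))
```

Vertex points that do not move, such as the first two here, stay fixed for every $t$, while the third travels along a straight line at constant velocity.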


Figure 10.20.7 


Once the positions of the vertex points are computed at time $t$, a warp is performed between the begin-picture 
and the triangulation at time $t$ determined by the displaced vertex points at that time. Figure 10.20.8 shows a 
time-varying warp at five values of $t$ generated from the warp between $t = 0$ and $t = 1$ shown in Figure 
10.20.6. 


Figure 10.20.8 


Morphs 


A time-varying morph can be described as a blending of two time-varying warps of two different pictures 
using two triangulations that match corresponding features in the two pictures. One of the two pictures is 
designated as the begin-picture and the other as the end-picture. First, a time-varying warp from $t = 0$ to 
$t = 1$ is generated in which the begin-picture is warped into the shape of the end-picture. Then a time-varying 
warp from $t = 1$ to $t = 0$ is generated in which the end-picture is warped into the shape of the begin-picture. 
Finally, a weighted average of the gray levels of the two warps at each time $t$ is produced to generate the 
morph of the two images at time $t$. 


Figure 10.20.9 shows two photographs of a woman taken 20 years apart. Below the pictures are two 
corresponding triangulations in which corresponding features of the two photographs are matched. The 
time-varying morph between these two pictures for five values of t between 0 and 1 is shown in Figure 
10.20.10. 


Figure 10.20.9 (lower panels: Begin-triangulation, End-triangulation) 


Figure 10.20.10 


The procedure for producing such a morph is outlined in the following nine steps (Figure 10.20.11): 


Step 1 Given a begin-picture with picture-density $\rho_0$ and an end-picture with picture-density $\rho_1$, position $n$ 
vertex points $\mathbf{v}_1$, $\mathbf{v}_2$, ..., $\mathbf{v}_n$ in the begin-picture at key features of that picture. 

Step 2 Position $n$ corresponding vertex points $\mathbf{w}_1$, $\mathbf{w}_2$, ..., $\mathbf{w}_n$ in the end-picture at the corresponding key 
features of that picture. 

Step 3 Triangulate the begin- and end-pictures in similar ways by drawing lines between corresponding 
vertex points in both pictures. 

Step 4 For any time $t$ between 0 and 1, find the vertex points $\mathbf{u}_1(t)$, $\mathbf{u}_2(t)$, ..., $\mathbf{u}_n(t)$ in the morph picture at 
that time, using the formula 

$$\mathbf{u}_i(t) = (1 - t)\mathbf{v}_i + t\mathbf{w}_i, \quad i = 1, 2, \ldots, n \tag{9}$$

Step 5 Triangulate the morph picture at time $t$ similar to the begin- and end-picture triangulations. 

Step 6 For any point $\mathbf{u}$ in the morph picture at time $t$, find the triangle in the triangulation of the morph 
picture in which it lies and the vertices $\mathbf{u}_I(t)$, $\mathbf{u}_J(t)$, and $\mathbf{u}_K(t)$ of that triangle. (See Exercise 1 to 
determine whether a given point lies in a given triangle.) 

Step 7 Express $\mathbf{u}$ as a convex combination of $\mathbf{u}_I(t)$, $\mathbf{u}_J(t)$, and $\mathbf{u}_K(t)$ by finding the constants $c_I$, $c_J$, and 
$c_K$ such that 

$$\mathbf{u} = c_I\mathbf{u}_I(t) + c_J\mathbf{u}_J(t) + c_K\mathbf{u}_K(t) \tag{10}$$

and 

$$c_I + c_J + c_K = 1 \tag{11}$$

Step 8 Determine the locations of the point $\mathbf{u}$ in the begin- and end-pictures using 

$$\mathbf{v} = c_I\mathbf{v}_I + c_J\mathbf{v}_J + c_K\mathbf{v}_K \quad \text{(in the begin-picture)} \tag{12}$$

and 

$$\mathbf{w} = c_I\mathbf{w}_I + c_J\mathbf{w}_J + c_K\mathbf{w}_K \quad \text{(in the end-picture)} \tag{13}$$

Step 9 Finally, determine the picture-density $\rho_t(\mathbf{u})$ of the morph-picture at the point $\mathbf{u}$ using 

$$\rho_t(\mathbf{u}) = (1 - t)\rho_0(\mathbf{v}) + t\rho_1(\mathbf{w}) \tag{14}$$


Step 9 is the key step in distinguishing a warp from a morph. Equation 14 takes weighted averages of the gray 
levels of the begin- and end-pictures to produce the gray levels of the morph-picture. The weights depend on 
the fraction of the distances that the vertex points have moved from their beginning positions to their ending 
positions. For example, if the vertex points have moved one-fourth of the way to their destinations (i.e., if 
$t = 0.25$), then we use one-fourth of the gray levels of the end-picture and three-fourths of the gray levels of 
the begin-picture. Thus, as time progresses, not only does the shape of the begin-picture gradually change into 
the shape of the end-picture (as in a warp) but the gray levels of the begin-picture also gradually change into 
the gray levels of the end-picture. 
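
Steps 4 and 7 through 9 can be sketched for a single matched pair of triangles as follows. This is an illustration under simplifying assumptions, not the full procedure: the search for the containing triangle (Step 6) is skipped because only one triangle is present, and the picture-densities `rho0` and `rho1`, the triangles, and the function names are all made up for the example.

```python
def barycentric(p, p1, p2, p3):
    """Coefficients c with p = c[0]*p1 + c[1]*p2 + c[2]*p3 and c[0] + c[1] + c[2] = 1."""
    ax, ay = p1[0] - p3[0], p1[1] - p3[1]
    bx, by = p2[0] - p3[0], p2[1] - p3[1]
    px, py = p[0] - p3[0], p[1] - p3[1]
    det = ax * by - bx * ay
    c1 = (px * by - bx * py) / det
    c2 = (ax * py - px * ay) / det
    return c1, c2, 1 - c1 - c2

def combine(c, tri):
    """Convex combination of the three vertices of tri with coefficients c."""
    return tuple(sum(ci * p[k] for ci, p in zip(c, tri)) for k in range(2))

def morph_density(u, t, begin_tri, end_tri, rho0, rho1):
    """Gray level of the morph-picture at the point u and time t (Equations 9 to 14)."""
    tri_t = [tuple((1 - t) * a + t * b for a, b in zip(v, w))   # Step 4, Equation 9
             for v, w in zip(begin_tri, end_tri)]
    c = barycentric(u, *tri_t)                                   # Step 7
    if min(c) < 0:
        raise ValueError("u lies outside the triangle at time t")
    v = combine(c, begin_tri)                                    # Step 8, Equation 12
    w = combine(c, end_tri)                                      # Step 8, Equation 13
    return (1 - t) * rho0(v) + t * rho1(w)                       # Step 9, Equation 14

begin_tri = [(0, 0), (4, 0), (0, 4)]
end_tri = [(0, 0), (4, 0), (0, 8)]
rho0 = lambda v: 10 * v[0] + 5 * v[1]   # hypothetical begin-picture gray levels
rho1 = lambda w: 20 * w[1]              # hypothetical end-picture gray levels

value = morph_density((1, 1.5), 0.5, begin_tri, end_tri, rho0, rho1)
print(value)
```

At $t = 0$ the function returns the begin-picture gray level at $\mathbf{u}$, and at $t = 1$ it returns the end-picture gray level, as Equation 14 requires.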


Figure 10.20.11 (Time = 0: begin-picture, given density $\rho_0(\mathbf{v})$; Time = $t$: morph-picture, computed density $\rho_t(\mathbf{u}) = (1 - t)\rho_0(\mathbf{v}) + t\rho_1(\mathbf{w})$; Time = 1: end-picture, given density $\rho_1(\mathbf{w})$) 


The procedure described above to generate a morph is cumbersome to perform by hand, but it is the kind of 
dull, repetitive procedure at which computers excel. A successful morph demands good preparation and 
requires more artistic ability than mathematical ability. (The software designer is required to have the 
mathematical ability.) The two photographs to be morphed should be carefully chosen so that they have 
matching features, and the vertex points in the two photographs also should be carefully chosen so that the 
triangles in the two resulting triangulations contain similar features of the two pictures. When the procedure is 
done correctly, each frame of the morph should look just as "real" as the begin- and end-pictures. 


The techniques we have discussed in this section can be generalized in numerous ways to produce much more 
elaborate warps and morphs. For example: 


1. If the pictures are in color, the three components of the picture colors (red, green, and blue) can be 
morphed separately to produce a color morph. 


2. Rather than following straight-line paths to their destinations, the vertices of a triangulation can be directed 
separately along more complicated paths to produce a variety of results. 


3. Rather than travel with constant speeds along their paths, the vertices of a triangulation can be directed to 
have different speeds at different times. For example, in a morph between two faces, the hairline can be 
made to change first, then the nose, and so forth. 


4. Similarly, the gray-level mixing of the begin-picture and end-picture at different times and different 
vertices can be varied in a more complicated way than that in Equation 14. 


5. One can morph two surfaces in three-dimensional space (representing two complete heads, for example) 
by triangulating the surfaces and using the techniques in this section. 


6. One can morph two solids in three-dimensional space (for example, two three-dimensional tomographs of 
a beating human heart at two different times) by dividing the two solids into corresponding tetrahedral 
regions. 


7. Two film strips can be morphed frame by frame by different amounts between each pair of frames to 
produce a morphed film strip in which, say, an actor walking along a set is gradually morphed into an ape 
walking along the set. 


8. Instead of using straight lines to triangulate two pictures to be morphed, more complicated curves, such as 
spline curves, can be matched between the two pictures. 


9. Three or more pictures can be morphed together by generalizing the formulas given in this section. 


These and other generalizations have made warping and morphing two of the most active areas in computer 
graphics. 


Exercise Set 10.20 


1. Determine whether the vector $\mathbf{v}$ is a convex combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$. Do this by solving 
Equations 1 and 3 for $c_1$, $c_2$, and $c_3$ and ascertaining whether these coefficients are nonnegative. 


@) = |; "=[\| = »-p] 


АНСЫН НЕН 
© "= |0) "= il "= 132) "=f 
@ nl v= p "n= kil = H 


(a) Yes; v = Lv; | 2+, | 2y, 


5 5 5 
(b) No; v = vi + iv — 193 
(с) Yes; v = vi 4 iv + буз 
(d) Yes; v = т=з! + Ér, + im 


2. Verify Equation 7 for the two triangulations given in Figure 10.20.4. 
Answer: 


$m$ = number of triangles = 7, $n$ = number of vertex points = 7, $k$ = number of boundary vertex points 
= 5; Equation 7 is $7 = 2(7) - 2 - 5$. 

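3. Let an affine transformation be given by a $2 \times 2$ matrix $M$ and a two-dimensional vector $\mathbf{b}$. Let 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$, where $c_1 + c_2 + c_3 = 1$; let $\mathbf{w} = M\mathbf{v} + \mathbf{b}$; and let $\mathbf{w}_i = M\mathbf{v}_i + \mathbf{b}$ for $i = 1, 2, 3$. 
Show that $\mathbf{w} = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$. (This shows that an affine transformation maps a convex 
combination of vectors to the same convex combination of the images of the vectors.) 


Answer: 

$$\mathbf{w} = M\mathbf{v} + \mathbf{b} = M(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3) + (c_1 + c_2 + c_3)\mathbf{b}$$

$$= c_1(M\mathbf{v}_1 + \mathbf{b}) + c_2(M\mathbf{v}_2 + \mathbf{b}) + c_3(M\mathbf{v}_3 + \mathbf{b}) = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + c_3\mathbf{w}_3$$
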
4. (a) Exhibit a triangulation of the points in Figure 10.20.4 in which the points $\mathbf{v}_3$, $\mathbf{v}_5$, and $\mathbf{v}_6$ form the 
vertices of a single triangle. 


(b) Exhibit a triangulation of the points in Figure 10.20.4 in which the points $\mathbf{v}_2$, $\mathbf{v}_5$, and $\mathbf{v}_7$ do not form 
the vertices of a single triangle. 


Answer: 
[Triangulation diagrams for parts (a) and (b)] 

5. Find the $2 \times 2$ matrix $M$ and two-dimensional vector $\mathbf{b}$ that define the affine transformation that maps the 
three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ to the three vectors $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3$. Do this by setting up a system of six 
linear equations for the four entries of the matrix $M$ and the two entries of the vector $\mathbf{b}$. 


s fof) bl 


Answer: 


(с) M 1 0 b= 2 
l| -3 

(d) lo d 
М=|2 , b=| 2 

2 0 —1 


6. (a) Let $\mathbf{a}$ and $\mathbf{b}$ be linearly independent vectors in the plane. Show that if $c_1$ and $c_2$ are nonnegative 
numbers such that $c_1 + c_2 = 1$, then the vector $c_1\mathbf{a} + c_2\mathbf{b}$ lies on the line segment connecting the tips 
of the vectors $\mathbf{a}$ and $\mathbf{b}$. 


(b) Let $\mathbf{a}$ and $\mathbf{b}$ be linearly independent vectors in the plane. Show that if $c_1$ and $c_2$ are nonnegative 
numbers such that $c_1 + c_2 \le 1$, then the vector $c_1\mathbf{a} + c_2\mathbf{b}$ lies in the triangle connecting the origin 
and the tips of the vectors $\mathbf{a}$ and $\mathbf{b}$. [Hint: First examine the vector $c_1\mathbf{a} + c_2\mathbf{b}$ multiplied by the scale 
factor $1/(c_1 + c_2)$.] 


(c) Let $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ be noncollinear points in the plane. Show that if $c_1$, $c_2$, and $c_3$ are nonnegative 
numbers such that $c_1 + c_2 + c_3 = 1$, then the vector $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ lies in the triangle 
connecting the tips of the three vectors. [Hint: Let $\mathbf{a} = \mathbf{v}_1 - \mathbf{v}_3$ and $\mathbf{b} = \mathbf{v}_2 - \mathbf{v}_3$, and then use 
Equation 1 and part (b) of this exercise.] 


7. (a) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$ lies on one of the three vertices of the triangle determined by the three 
vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$? 


(b) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$ lies on one of the three sides of the triangle determined by the three 
vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$? 


(c) What can you say about the coefficients $c_1$, $c_2$, and $c_3$ that determine a convex combination 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ if $\mathbf{v}$ lies in the interior of the triangle determined by the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, 
and $\mathbf{v}_3$? 


Answer: 


(a) Two of the coefficients are zero. 
(b) At least one of the coefficients is zero. 


(c) None of the coefficients are zero. 


8. (a) The centroid of a triangle lies on the line segment connecting any one of the three vertices of the 
triangle with the midpoint of the opposite side. Its location on this line segment is two-thirds of the 
distance from the vertex. If the three vertices are given by the vectors V1, V2, and V3, write the 
centroid as a convex combination of these three vectors. 


(b) Use your result in part (a) to find the vector defining the centroid of the triangle with the three vertices 


ВЕЕ 


Answer: 


(a) $\tfrac{1}{3}\mathbf{v}_1 + \tfrac{1}{3}\mathbf{v}_2 + \tfrac{1}{3}\mathbf{v}_3$ 


b) | 8/3 
Bl А | 
Section 10.20 Technology Exercises 


The following exercises are designed to be solved using a technology utility. Typically, this will be 
MATLAB, Mathematica, Maple, Derive, or Mathcad, but it may also be some other type of linear algebra 
software or a scientific calculator with some linear algebra capabilities. For each exercise you will need to 
read the relevant documentation for the particular utility you are using. The goal of these exercises is to 
provide you with a basic proficiency with your technology utility. Once you have mastered the techniques in 
these exercises, you will be able to use your technology utility to solve many of the problems in the regular 
exercise sets. 


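T1. To warp or morph a surface in $R^3$ we must be able to triangulate the surface. Let 

$$\mathbf{v}_1 = \begin{bmatrix} v_{11} \\ v_{12} \\ v_{13} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} v_{21} \\ v_{22} \\ v_{23} \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_3 = \begin{bmatrix} v_{31} \\ v_{32} \\ v_{33} \end{bmatrix}$$

be three noncollinear vectors on the surface. Then a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ lies in the 
triangle formed by these three vectors if and only if $\mathbf{v}$ is a convex combination of the three vectors; that is, 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3$ for some nonnegative coefficients $c_1$, $c_2$, and $c_3$ whose sum is 1. 


(a) Show that in this case, $c_1$, $c_2$, and $c_3$ are solutions of the following linear system: 

$$\begin{bmatrix} v_{11} & v_{21} & v_{31} \\ v_{12} & v_{22} & v_{32} \\ v_{13} & v_{23} & v_{33} \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 1 \end{bmatrix}$$

In parts (b)-(d) determine whether the vector $\mathbf{v}$ is a convex combination of the vectors 

$$\mathbf{v}_1 = \begin{bmatrix} 2 \\ 7 \\ -5 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 3 \\ 0 \\ 9 \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_3 = \begin{bmatrix} 2 \\ 2 \\ -4 \end{bmatrix}.$$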
6) А 
= 9 
9 
©, Е 
reg 9 
9 


у= 1 —] 
50 
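T2. To warp or morph a solid object in $R^3$ we first partition the object into disjoint tetrahedrons. Let 

$$\mathbf{v}_1 = \begin{bmatrix} v_{11} \\ v_{12} \\ v_{13} \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} v_{21} \\ v_{22} \\ v_{23} \end{bmatrix}, \quad \mathbf{v}_3 = \begin{bmatrix} v_{31} \\ v_{32} \\ v_{33} \end{bmatrix}, \quad \text{and} \quad \mathbf{v}_4 = \begin{bmatrix} v_{41} \\ v_{42} \\ v_{43} \end{bmatrix}$$

be four noncoplanar vectors. Then a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ lies in the solid tetrahedron formed by these 
four vectors if and only if $\mathbf{v}$ is a convex combination of the four vectors; that is, 
$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 + c_4\mathbf{v}_4$ for some nonnegative coefficients $c_1$, $c_2$, $c_3$, and $c_4$ whose sum is one. 


(a) Show that in this case, $c_1$, $c_2$, $c_3$, and $c_4$ are solutions of the following linear system: 

$$\begin{bmatrix} v_{11} & v_{21} & v_{31} & v_{41} \\ v_{12} & v_{22} & v_{32} & v_{42} \\ v_{13} & v_{23} & v_{33} & v_{43} \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 1 \end{bmatrix}$$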
2 
In parts (b)-(d) determine whether the vector v is a convex combination of the vectors v? = | —6 |, 
1 
—3 T —1 
v;—| 4|,va5—|2|,andv4—| 3 |. 
2 3 2 
(b) 5 
v—|0 
7 
(c) 1 
v—|1 
2 
(d) 1 
v—|2 
2 
APPENDIX A 


How to Read Theorems 


Since many of the most important concepts in linear algebra occur as theorem statements, it is important to be 
familiar with the various ways in which theorems can be structured. This appendix will help you to do that. 


Contrapositive Form of a Theorem 


The simplest theorems are of the form 


If H is true, then C is true. (1) 


where H is a statement, called the hypothesis, and C is a statement, called the conclusion. The theorem is true 
if the conclusion is true whenever the hypothesis is true, and the theorem is false if there is some case where 
the hypothesis is true but the conclusion is false. It is common to denote a theorem of form 1 as 

$$H \Rightarrow C \tag{2}$$

(read, “H implies C”). As an example, the theorem 

If a and b are both positive numbers, then ab is a positive number. (3) 


is of form 2, where 


H = a and b are both positive numbers (4) 


C = ab is a positive number (5) 


Sometimes it is desirable to phrase theorems in a negative way. For example, the theorem in 3 can be 
rephrased equivalently as 


If ab is not a positive number, then a and b are not both positive numbers. (6) 


If we write $\sim H$ to mean that H is false and $\sim C$ to mean that C is false, then the structure of the theorem in 6 
is 

$$\sim C \Rightarrow\ \sim H \tag{7}$$

In general, any theorem of form 2 can be rephrased in form 7, which is called the contrapositive of 2. If a 
theorem is true, then so is its contrapositive, and vice versa. 


Converse of a Theorem 



The converse of a theorem is the statement that results when the hypothesis and conclusion are interchanged. 


Thus, the converse of the theorem $H \Rightarrow C$ is the statement $C \Rightarrow H$. Whereas the contrapositive of a true
theorem must itself be a true theorem, the converse of a true theorem may or may not be true. For example,
the converse of 3 is the false statement

If ab is a positive number, then a and b are both positive numbers.

(for instance, a = b = -1 gives the positive product ab = 1), but the converse of the true theorem

If a > b, then 2a > 2b. (8)

is the true theorem

If 2a > 2b, then a > b. (9)


Equivalent Statements


If a theorem $H \Rightarrow C$ and its converse $C \Rightarrow H$ are both true, then we say that H and C are equivalent
statements, which we denote by writing

$H \Leftrightarrow C$ (10)

(read, "H and C are equivalent"). There are various ways of phrasing equivalent statements as a single
theorem. Here are three ways in which 8 and 9 can be combined into a single theorem.


Form 1


If a > b, then 2a > 2b, and conversely, if 2a > 2b, then a > b.


Form 2


a > b if and only if 2a > 2b.


Form 3


The following statements are equivalent.
(i) a > b
(ii) 2a > 2b
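
The relationships among a theorem, its contrapositive, and its converse can be checked mechanically with a truth table. The following is a small sketch, not part of the original appendix; the helper name `implies` is an assumption made for illustration.

```python
from itertools import product


def implies(p, q):
    """Truth value of the statement 'if p, then q'."""
    return (not p) or q


# All four assignments of truth values to the pair (H, C).
CASES = list(product([False, True], repeat=2))

# H => C and its contrapositive ~C => ~H agree in every case.
contrapositive_equivalent = all(
    implies(h, c) == implies(not c, not h) for h, c in CASES
)

# H => C and its converse C => H do not agree in every case.
converse_equivalent = all(implies(h, c) == implies(c, h) for h, c in CASES)

# H <=> C holds exactly when both H => C and C => H hold.
iff_matches = all(
    (h == c) == (implies(h, c) and implies(c, h)) for h, c in CASES
)
```

Here `contrapositive_equivalent` and `iff_matches` come out true, while `converse_equivalent` comes out false; the case H false, C true is one where the theorem holds but its converse does not.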


Theorems Involving Three or More Statements 


Sometimes two true theorems will give you a third true theorem for free. Specifically, if $H \Rightarrow C$ is a true
theorem, and $C \Rightarrow D$ is a true theorem, then $H \Rightarrow D$ must also be a true theorem. For example, the theorems


If opposite sides of a quadrilateral are parallel, then the quadrilateral is a parallelogram.


and 
Opposite sides of a parallelogram have equal lengths. 


imply the third theorem 
If opposite sides of a quadrilateral are parallel, then they have equal lengths. 


Sometimes three theorems yield equivalent statements for free. For example, if 

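$H \Rightarrow C, \quad C \Rightarrow D, \quad D \Rightarrow H$ (11)
then we have the implication loop in Figure A.1 from which we can conclude that

$C \Rightarrow H, \quad D \Rightarrow C, \quad H \Rightarrow D$ (12)
Combining this with 11 we obtain

$H \Leftrightarrow C, \quad C \Leftrightarrow D, \quad D \Leftrightarrow H$ (13)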


In summary, if you want to prove the three equivalences in 13, you need only prove the three implications in 
11. 
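
The claim that the implication loop forces all three equivalences can be confirmed by exhausting the eight truth assignments. This is an illustrative sketch only; the function names are assumptions, not notation from the text.

```python
from itertools import product


def implies(p, q):
    """Truth value of the statement 'if p, then q'."""
    return (not p) or q


def loop_forces_equivalence():
    """Check that H => C, C => D, D => H together force H, C, D to agree."""
    for h, c, d in product([False, True], repeat=3):
        loop_holds = implies(h, c) and implies(c, d) and implies(d, h)
        if loop_holds and not (h == c == d):
            # A counterexample would be an assignment where the loop
            # holds but the three statements are not all equivalent.
            return False
    return True


loop_result = loop_forces_equivalence()
```

No counterexample exists, so `loop_result` is true: whenever the three implications in the loop hold, H, C, and D have the same truth value.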




Figure A.1 
APPENDIX B


Complex Numbers 


Complex numbers arise naturally in the course of solving polynomial equations. For example, the solutions of 
the quadratic equation $ax^2 + bx + c = 0$, which are given by the quadratic formula


$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

are complex numbers if the expression inside the radical is negative. In this appendix we will review some of 
the basic ideas about complex numbers that are used in this text. 


Complex Numbers 


To deal with the problem that the equation $x^2 = -1$ has no real solutions, mathematicians of the eighteenth
century invented the "imaginary" number


$i = \sqrt{-1}$


which is assumed to have the property


$i^2 = (\sqrt{-1})^2 = -1$


but which otherwise has the algebraic properties of a real number. An expression of the form


$a + bi$ or $a + ib$
in which a and b are real numbers is called a complex number. Sometimes it will be convenient to use a
single letter, typically z, to denote a complex number, in which case we write
$z = a + bi$ or $z = a + ib$
The number a is called the real part of z and is denoted by Re(z), and the number b is called the imaginary
part of z and is denoted by Im(z). Thus,


$\mathrm{Re}(3 + 2i) = 3, \quad \mathrm{Im}(3 + 2i) = 2$
$\mathrm{Re}(1 - 5i) = 1, \quad \mathrm{Im}(1 - 5i) = \mathrm{Im}(1 + (-5)i) = -5$
$\mathrm{Re}(7i) = \mathrm{Re}(0 + 7i) = 0, \quad \mathrm{Im}(7i) = 7$
$\mathrm{Re}(4) = 4, \quad \mathrm{Im}(4) = \mathrm{Im}(4 + 0i) = 0$


Two complex numbers are considered equal if and only if their real parts are equal and their imaginary parts 
are equal; that is, 


$a + bi = c + di$ if and only if $a = c$ and $b = d$


A complex number $z = bi$ whose real part is zero is said to be pure imaginary. A complex number $z = a$
whose imaginary part is zero is a real number, so the real numbers can be viewed as a subset of the complex
numbers.


Complex numbers are added, subtracted, and multiplied in accordance with the standard rules of algebra but
with $i^2 = -1$:


$(a + bi) + (c + di) = (a + c) + (b + d)i$ (1)

$(a + bi) - (c + di) = (a - c) + (b - d)i$ (2)

$(a + bi)(c + di) = (ac - bd) + (ad + bc)i$ (3)
The multiplication formula is obtained by expanding the left side and using the fact that $i^2 = -1$. Also note
that if $b = 0$, then the multiplication formula simplifies to
$a(c + di) = ac + adi$ (4)


The set of complex numbers with these operations is commonly denoted by the symbol C and is called the 
complex number system. 


EXAMPLE 1 Multiplying Complex Numbers


As a practical matter, it is usually more convenient to compute products of complex numbers by
expansion, rather than substituting in 3. For example,


$(3 - 2i)(4 + 5i) = 12 + 15i - 8i - 10i^2 = (12 + 10) + 7i = 22 + 7i$
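
As a quick check of formula 3 against this example, the product can be computed both from the formula and with a language's built-in complex type. This is a minimal sketch; the function name `multiply` is an assumption made for illustration, and Python writes the imaginary unit as `j`.

```python
def multiply(a, b, c, d):
    """Return (real part, imaginary part) of (a + bi)(c + di) using formula 3."""
    return (a * c - b * d, a * d + b * c)


# The product from Example 1, computed from the formula.
product_pair = multiply(3, -2, 4, 5)

# The same product using Python's built-in complex numbers.
builtin_product = complex(3, -2) * complex(4, 5)
```

Both computations give real part 22 and imaginary part 7, in agreement with the expansion above.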


The Complex Plane 


A complex number $z = a + bi$ can be associated with the ordered pair (a, b) of real numbers and represented
geometrically by a point or a vector in the xy-plane (Figure B.1). We call this the complex plane. Points on 
the x-axis have an imaginary part of zero and hence correspond to real numbers, whereas points on the y-axis 
have a real part of zero and correspond to pure imaginary numbers. Accordingly, we call the x-axis the real 
axis and the y-axis the imaginary axis (Figure B.2). 


Figure B.1


Figure B.2 (axes: the real axis, on which a is the real part of z, and the imaginary axis, on which b is the imaginary part of z)


Complex numbers can be added, subtracted, or multiplied by real numbers geometrically by performing these 
operations on their associated vectors (Figure B.3, for example). In this sense the complex number system C 
is closely related to $R^2$, the main difference being that complex numbers can be multiplied to produce other
complex numbers, whereas there is no multiplication operation on $R^2$ that produces other vectors in $R^2$ (the
dot product produces a scalar, not a vector in $R^2$).


Figure B.3 (panels: the sum of two complex numbers; the difference of two complex numbers)


If $z = a + bi$ is a complex number, then the complex conjugate of z, or more simply, the conjugate of z, is
denoted by $\bar{z}$ (read, "z bar") and is defined by


$\bar{z} = a - bi$ (5)


Numerically, $\bar{z}$ is obtained from z by reversing the sign of the imaginary part, and geometrically it is obtained
by reflecting the vector for z about the real axis (Figure B.4).


Figure B.4


EXAMPLE 2 Some Complex Conjugates


$z = 3 + 4i, \quad \bar{z} = 3 - 4i$
$z = -2 - 5i, \quad \bar{z} = -2 + 5i$
$z = i, \quad \bar{z} = -i$
$z = 7, \quad \bar{z} = 7$


Remark The last computation in this example illustrates the fact that a real number is equal to its complex
conjugate. More generally, $z = \bar{z}$ if and only if z is a real number.


The following computation shows that the product of a complex number $z = a + bi$ and its conjugate
$\bar{z} = a - bi$ is a nonnegative real number:


$z\bar{z} = (a + bi)(a - bi) = a^2 - abi + bai - b^2i^2 = a^2 + b^2$ (6)


You will recognize that


$\sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$


is the length of the vector corresponding to z (Figure B.5); we call this length the modulus (or absolute value)
of z and denote it by $|z|$. Thus,
$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$ (7)


Note that if $b = 0$, then $z = a$ is a real number and $|z| = \sqrt{a^2} = |a|$, which tells us that the modulus of a real
number is the same as its absolute value as defined in beginning algebra.


Figure B.5 


EXAMPLE 3 Some Modulus Computations


$z = 3 + 4i, \quad |z| = \sqrt{3^2 + 4^2} = 5$
$z = i, \quad |z| = \sqrt{0^2 + 1^2} = 1$
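
Formulas 5 through 7 can be tried out numerically. The sketch below uses Python's built-in complex type, where `conjugate()` and `abs()` play the roles of the conjugate and the modulus; the variable names are chosen only for illustration.

```python
import math

z = complex(3, 4)

# Conjugate: reverse the sign of the imaginary part (formula 5).
z_bar = z.conjugate()

# Product of z with its conjugate: a real number a^2 + b^2 (formula 6).
product_with_conjugate = z * z_bar

# Modulus, computed two ways (formula 7).
modulus = abs(z)
modulus_from_formula = math.sqrt(product_with_conjugate.real)

# Modulus of i.
modulus_of_i = abs(complex(0, 1))
```

For z = 3 + 4i the product with the conjugate is 25 and both modulus computations give 5, matching the first line of Example 3.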


Reciprocals and Division 
If $z \neq 0$, then the reciprocal (or multiplicative inverse) of z is denoted by $1/z$ (or $z^{-1}$) and is defined by the
property

$\left(\dfrac{1}{z}\right) z = 1$


This equation has a unique solution for $1/z$, which we can obtain by multiplying both sides by $\bar{z}$ and using
the fact that $z\bar{z} = |z|^2$ [see 7]. This yields


$\dfrac{1}{z} = \dfrac{\bar{z}}{|z|^2}$ (8)

If $z_2 \neq 0$, then the quotient $z_1/z_2$ is defined to be the product of $z_1$ and $1/z_2$. This yields the formula
$\dfrac{z_1}{z_2} = \dfrac{\bar{z}_2}{|z_2|^2} \cdot z_1 = \dfrac{z_1\bar{z}_2}{|z_2|^2}$ (9)


Observe that the expression on the right side of 9 results if the numerator and denominator of $z_1/z_2$ are
multiplied by $\bar{z}_2$. As a practical matter, this is often the best way to perform divisions of complex numbers.


EXAMPLE 4 Division of Complex Numbers
Let $z_1 = 3 + 4i$ and $z_2 = 1 - 2i$. Express $z_1/z_2$ in the form $a + bi$.


Solution We will multiply the numerator and denominator of $z_1/z_2$ by $\bar{z}_2$. This yields
$\dfrac{z_1}{z_2} = \dfrac{z_1\bar{z}_2}{z_2\bar{z}_2} = \dfrac{3 + 4i}{1 - 2i} \cdot \dfrac{1 + 2i}{1 + 2i}$
$= \dfrac{3 + 6i + 4i + 8i^2}{1 - 4i^2}$
$= \dfrac{-5 + 10i}{5}$
$= -1 + 2i$
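
The division procedure of formula 9 can be written out directly. The following is a sketch under the assumption that the real and imaginary parts are integers, so that exact fractions can be used; the function name `divide` is hypothetical.

```python
from fractions import Fraction


def divide(a, b, c, d):
    """Return (real, imag) of (a + bi) / (c + di) using formula 9.

    The numerator is z1 times the conjugate of z2, which equals
    (ac + bd) + (bc - ad)i, and the denominator is |z2|^2 = c^2 + d^2.
    """
    denominator = c * c + d * d
    if denominator == 0:
        raise ZeroDivisionError("division by the complex number 0")
    return (Fraction(a * c + b * d, denominator),
            Fraction(b * c - a * d, denominator))


# Example 4: (3 + 4i) / (1 - 2i).
quotient = divide(3, 4, 1, -2)

# Floating-point comparison with Python's built-in complex division.
builtin_quotient = complex(3, 4) / complex(1, -2)
```

For Example 4 the function returns real part -1 and imaginary part 2, which agrees with the hand computation.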


The following theorems list some useful properties of the modulus and conjugate operations. 


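THEOREM B.1


The following results hold for any complex numbers $z$, $z_1$, and $z_2$.
(a) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$

(b) $\overline{z_1 - z_2} = \bar{z}_1 - \bar{z}_2$

(c) $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$

(d) $\overline{z_1 / z_2} = \bar{z}_1 / \bar{z}_2$

(e) $\bar{\bar{z}} = z$


THEOREM B.2


The following results hold for any complex numbers $z$, $z_1$, and $z_2$.
(a) $|\bar{z}| = |z|$

(b) $|z_1 z_2| = |z_1||z_2|$

(c) $|z_1 / z_2| = |z_1| / |z_2|$

(d) $|z_1 + z_2| \leq |z_1| + |z_2|$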


Polar Form of a Complex Number 


If $z = a + bi$ is a nonzero complex number, and if $\phi$ is an angle from the real axis to the vector z, then, as
suggested in Figure B.6, the real and imaginary parts of z can be expressed as


$a = |z|\cos\phi$ and $b = |z|\sin\phi$ (10)
Thus, the complex number $z = a + bi$ can be expressed as
$z = |z|(\cos\phi + i\sin\phi)$ (11)
which is called a polar form of z. The angle $\phi$ in this formula is called an argument of z. The argument of z is
not unique because we can add or subtract any multiple of $2\pi$ to it to obtain a different argument of z.
However, there is only one argument whose radian measure satisfies


$-\pi < \phi \leq \pi$ (12)


This is called the principal argument of z.


Figure B.6


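EXAMPLE 5 Polar Form of a Complex Number
Express $z = 1 - \sqrt{3}\,i$ in polar form using the principal argument.


Solution The modulus of z is


$|z| = \sqrt{1^2 + (-\sqrt{3})^2} = \sqrt{4} = 2$


Thus, it follows from 10 with $a = 1$ and $b = -\sqrt{3}$ that
$1 = 2\cos\phi$ and $-\sqrt{3} = 2\sin\phi$


and this implies that
$\cos\phi = \dfrac{1}{2}$ and $\sin\phi = -\dfrac{\sqrt{3}}{2}$
The unique angle $\phi$ that satisfies these equations and whose radian measure satisfies 12 is
$\phi = -\pi/3$ (Figure B.7). Thus, a polar form of z is


$z = 2\left(\cos\left(-\dfrac{\pi}{3}\right) + i\sin\left(-\dfrac{\pi}{3}\right)\right) = 2\left(\cos\dfrac{\pi}{3} - i\sin\dfrac{\pi}{3}\right)$


Figure B.7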
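
The polar form in Example 5 can be reproduced numerically. This sketch relies on Python's standard `cmath` module, whose `polar` function returns the modulus together with the phase in radians and whose `rect` function converts back; the variable names are illustrative only.

```python
import cmath
import math

z = complex(1, -math.sqrt(3))

# polar() returns (modulus, phase); for this z the phase is the
# principal argument, which lies strictly between -pi and pi.
modulus, argument = cmath.polar(z)

# rect() rebuilds modulus * (cos(argument) + i sin(argument)).
rebuilt = cmath.rect(modulus, argument)
```

Up to rounding, the modulus is 2 and the argument is -π/3, and converting back recovers the original number.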


Geometric Interpretation of Multiplication and Division of Complex 
Numbers 


We now show how polar forms of complex numbers provide geometric interpretations of multiplication and
division. Let
$z_1 = |z_1|(\cos\phi_1 + i\sin\phi_1)$ and $z_2 = |z_2|(\cos\phi_2 + i\sin\phi_2)$
be polar forms of the nonzero complex numbers $z_1$ and $z_2$. Multiplying, we obtain
$z_1 z_2 = |z_1||z_2|[(\cos\phi_1\cos\phi_2 - \sin\phi_1\sin\phi_2) + i(\sin\phi_1\cos\phi_2 + \cos\phi_1\sin\phi_2)]$

Now applying the trigonometric identities

$\cos(\phi_1 + \phi_2) = \cos\phi_1\cos\phi_2 - \sin\phi_1\sin\phi_2$

$\sin(\phi_1 + \phi_2) = \sin\phi_1\cos\phi_2 + \cos\phi_1\sin\phi_2$


yields
$z_1 z_2 = |z_1||z_2|[\cos(\phi_1 + \phi_2) + i\sin(\phi_1 + \phi_2)]$ (13)


which is a polar form of the complex number with modulus $|z_1||z_2|$ and argument $\phi_1 + \phi_2$. Thus, we have
shown that multiplying two complex numbers has the geometric effect of multiplying their moduli and adding
their arguments (Figure B.8).


Figure B.8 


Similar kinds of computations show that


$\dfrac{z_1}{z_2} = \dfrac{|z_1|}{|z_2|}[\cos(\phi_1 - \phi_2) + i\sin(\phi_1 - \phi_2)]$ (14)


which tells us that dividing complex numbers has the geometric effect of dividing their moduli and subtracting
their arguments (both in the appropriate order).


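EXAMPLE 6 Multiplying and Dividing in Polar Form


Use polar forms of the complex numbers $z_1 = 1 + \sqrt{3}\,i$ and $z_2 = \sqrt{3} + i$ to compute $z_1 z_2$ and
$z_1/z_2$.


Solution Polar forms of these complex numbers are
$z_1 = 2\left(\cos\dfrac{\pi}{3} + i\sin\dfrac{\pi}{3}\right)$ and $z_2 = 2\left(\cos\dfrac{\pi}{6} + i\sin\dfrac{\pi}{6}\right)$
(verify). Thus, it follows from 13 that


$z_1 z_2 = 4\left[\cos\left(\dfrac{\pi}{3} + \dfrac{\pi}{6}\right) + i\sin\left(\dfrac{\pi}{3} + \dfrac{\pi}{6}\right)\right] = 4\left[\cos\dfrac{\pi}{2} + i\sin\dfrac{\pi}{2}\right] = 4i$
and from 14 that


$\dfrac{z_1}{z_2} = 1 \cdot \left[\cos\left(\dfrac{\pi}{3} - \dfrac{\pi}{6}\right) + i\sin\left(\dfrac{\pi}{3} - \dfrac{\pi}{6}\right)\right] = \cos\dfrac{\pi}{6} + i\sin\dfrac{\pi}{6} = \dfrac{\sqrt{3}}{2} + \dfrac{1}{2}i$


As a check, let us calculate $z_1 z_2$ and $z_1/z_2$ directly:

$z_1 z_2 = (1 + \sqrt{3}\,i)(\sqrt{3} + i) = \sqrt{3} + i + 3i + \sqrt{3}\,i^2 = 4i$

$\dfrac{z_1}{z_2} = \dfrac{1 + \sqrt{3}\,i}{\sqrt{3} + i} \cdot \dfrac{\sqrt{3} - i}{\sqrt{3} - i} = \dfrac{\sqrt{3} - i + 3i - \sqrt{3}\,i^2}{3 - i^2} = \dfrac{2\sqrt{3} + 2i}{4} = \dfrac{\sqrt{3}}{2} + \dfrac{1}{2}i$


which agrees with the results obtained using polar forms.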
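
The rules in formulas 13 and 14 (multiply or divide the moduli, add or subtract the arguments) can be sketched as follows. The functions `polar_multiply` and `polar_divide` and the representation of a polar form as a (modulus, argument) pair are assumptions made for this illustration.

```python
import cmath
import math


def polar_multiply(p1, p2):
    """Multiply two numbers given as (modulus, argument) pairs (formula 13)."""
    (r1, phi1), (r2, phi2) = p1, p2
    return (r1 * r2, phi1 + phi2)


def polar_divide(p1, p2):
    """Divide two numbers given as (modulus, argument) pairs (formula 14)."""
    (r1, phi1), (r2, phi2) = p1, p2
    return (r1 / r2, phi1 - phi2)


# Polar forms of z1 = 1 + sqrt(3) i and z2 = sqrt(3) + i from Example 6.
z1_polar = (2.0, math.pi / 3)
z2_polar = (2.0, math.pi / 6)

# Convert the results back to the form a + bi for comparison.
product = cmath.rect(*polar_multiply(z1_polar, z2_polar))
quotient = cmath.rect(*polar_divide(z1_polar, z2_polar))
```

Up to rounding error, `product` equals 4i and `quotient` equals √3/2 + (1/2)i, as in the example. Note that the sum or difference of two principal arguments need not itself be a principal argument.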


Remark The complex number i has a modulus of 1 and a principal argument of $\pi/2$. Thus, if z is a complex
number, then iz has the same modulus as z but its argument is greater by $\pi/2$ (= 90°); that is, multiplication
by i has the geometric effect of rotating the vector z counterclockwise by 90° (Figure B.9).


Figure B.9 


DeMoivre's Formula


If n is a positive integer, and if z is a nonzero complex number with polar form
$z = |z|(\cos\phi + i\sin\phi)$


then raising z to the nth power yields


$z^n = \underbrace{z \cdot z \cdots z}_{n\ \text{factors}} = |z|^n\left[\cos(\underbrace{\phi + \phi + \cdots + \phi}_{n\ \text{terms}}) + i\sin(\underbrace{\phi + \phi + \cdots + \phi}_{n\ \text{terms}})\right]$
which we can write more succinctly as
$z^n = |z|^n(\cos n\phi + i\sin n\phi)$ (15)


In the special case where $|z| = 1$ this formula simplifies to
$z^n = \cos n\phi + i\sin n\phi$


which, using the polar form for z, becomes
$(\cos\phi + i\sin\phi)^n = \cos n\phi + i\sin n\phi$ (16)


This result is called DeMoivre's formula. 
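
Formula 16 can be checked numerically by comparing its right side with the result of multiplying cos φ + i sin φ by itself n times. The sketch below is illustrative; the function names and the sample values of φ and n are assumptions.

```python
import math


def demoivre_power(phi, n):
    """Right side of formula 16: cos(n phi) + i sin(n phi)."""
    return complex(math.cos(n * phi), math.sin(n * phi))


def repeated_product(phi, n):
    """Left side of formula 16, computed by n successive multiplications."""
    z = complex(math.cos(phi), math.sin(phi))
    result = complex(1, 0)
    for _ in range(n):
        result *= z
    return result


# Sample values chosen arbitrarily for the comparison.
difference = abs(demoivre_power(0.7, 5) - repeated_product(0.7, 5))
```

The two sides agree to within floating-point rounding, so `difference` is negligibly small.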


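Euler's Formula


If $\theta$ is a real number, say the radian measure of some angle, then the complex exponential function $e^{i\theta}$ is
defined to be

$e^{i\theta} = \cos\theta + i\sin\theta$ (17)
which is sometimes called Euler's formula. One motivation for this formula comes from the Maclaurin series
in calculus. Readers who have studied infinite series in calculus can deduce 17 by formally substituting $i\theta$ for
x in the Maclaurin series for $e^x$ and writing


$e^{i\theta} = 1 + i\theta + \dfrac{(i\theta)^2}{2!} + \dfrac{(i\theta)^3}{3!} + \dfrac{(i\theta)^4}{4!} + \dfrac{(i\theta)^5}{5!} + \dfrac{(i\theta)^6}{6!} + \cdots$

$= 1 + i\theta - \dfrac{\theta^2}{2!} - i\dfrac{\theta^3}{3!} + \dfrac{\theta^4}{4!} + i\dfrac{\theta^5}{5!} - \dfrac{\theta^6}{6!} + \cdots$

$= \left(1 - \dfrac{\theta^2}{2!} + \dfrac{\theta^4}{4!} - \dfrac{\theta^6}{6!} + \cdots\right) + i\left(\theta - \dfrac{\theta^3}{3!} + \dfrac{\theta^5}{5!} - \cdots\right)$

$= \cos\theta + i\sin\theta$


where the last step follows from the Maclaurin series for $\cos\theta$ and $\sin\theta$.
If $z = a + bi$ is any complex number, then the complex exponential $e^z$ is defined to be

$e^z = e^{a + bi} = e^a e^{ib} = e^a(\cos b + i\sin b)$ (18)
It can be proved that complex exponentials satisfy the standard laws of exponents. Thus, for example,


$e^{z_1}e^{z_2} = e^{z_1 + z_2}, \quad \dfrac{e^{z_1}}{e^{z_2}} = e^{z_1 - z_2}, \quad \dfrac{1}{e^z} = e^{-z}$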
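
The series argument behind formula 17 can be illustrated numerically by summing the first several terms of the Maclaurin series with $i\theta$ in place of x. This is only a sketch: the function name, the number of terms, and the sample values are assumptions, and a finite partial sum is an approximation rather than a proof.

```python
import cmath
import math


def exp_i_theta_partial_sum(theta, terms=30):
    """Sum the first `terms` terms of the series for e^(i theta)."""
    total = complex(0, 0)
    term = complex(1, 0)  # the k = 0 term, (i theta)^0 / 0!
    for k in range(terms):
        total += term
        # Next term: multiply by (i theta) and divide by (k + 1).
        term *= complex(0, theta) / (k + 1)
    return total


theta = 1.2
series_value = exp_i_theta_partial_sum(theta)
euler_value = complex(math.cos(theta), math.sin(theta))

# Formula 18 for a general complex exponent a + bi.
a, b = 0.5, 2.0
from_formula_18 = math.exp(a) * complex(math.cos(b), math.sin(b))
from_library = cmath.exp(complex(a, b))
```

For moderate values of θ the partial sum matches cos θ + i sin θ to within rounding error, and the library exponential agrees with formula 18.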
Answers to Exercises


Exercise Set 1.1
1. (a), (c), and (f) are linear equations; (b), (d) and (e) are not linear equations
3. (a) and (d) are linear systems; (b) and (c) are not linear systems

5. (a) and (d) are both consistent 

7. (a), (d), and (e) are solutions; (b) and (c) are not solutions 
9 


у = é 
ta = pirit 
х) = r 
X3 = 68 
ха = É 
11. a. 2x1 0 
3x; = 4x2 = 0 
x2 = 1 
b. 3x1 = 2x3 = 5 
7x, + x23 + 4х3 = =3 
—2x2 + x3 = 7 
c. 7x1 + 2x2 4 хз = 3x4 = 5 
хр + 2x2 + 4x3 = 1 
d. X1 = T 
х2 = —2 
х3 = 3 
ха = 
13. ac 6 
3 8 
9 -3 
b. Ё -1 3 | 
0 5-11 
| 0.2 0-3 1 0 
-3 –1 1 0 0 -1 
6 2-1 2-3 6 
a [1 0 0 0 —1 7] 


True/False 1.1 
(a) True 

(b) False 

(c) True 
(d) True 
(e) False 
(f) False 
(g) True 
(h) False 


Exercise Set 1.2 


1. a. Both
b. Both
c. Both
d. Both
e. Both
f. Both
g. Row echelon

3. a. $x_1 = -37$, $x_2 = -8$, $x_3 = 5$
b. $x_1 = 13t - 10$, $x_2 = 13t - 5$, $x_3 = -t + 2$, $x_4 = t$
c. $x_1 = -7s + 2t - 11$, $x_2 = s$, $x_3 = -3t - 4$, $x_4 = -3t + 9$, $x_5 = t$
d. Inconsistent
5. $x_1 = 3$, $x_2 = 1$, $x_3 = 2$
7. $x = t - 1$, $y = 2s$, $z = s$, $w = t$
9. $x_1 = 3$, $x_2 = 1$, $x_3 = 2$
11. $x = t - 1$, $y = 2s$, $z = s$, $w = t$
13. Has nontrivial solutions
15. Has nontrivial solutions
17. $x_1 = 0$, $x_2 = 0$, $x_3 = 0$
19. $x_1 = -s$, $x_2 = -t - s$, $x_3 = 4s$, $x_4 = t$
21. $w = t$, $x = -t$, $y = t$, $z = 0$
23. $I_1 = -1$, $I_2 = 0$, $I_3 = 1$, $I_4 = 2$
25. If $a = 4$, there are infinitely many solutions; if $a = -4$, there are no solutions; if $a \neq \pm 4$, there is exactly one solution.
27. If $a = 3$, there are infinitely many solutions; if $a = -3$, there are no solutions; if $a \neq \pm 3$, there is exactly one solution.
29.,—2a2 48 , а 2b 

d a dpi eo 
5 E | and [ il are possible answers. 


35. х= +1, y= +93, z= +2 
зт. 421, b= —6,с—2,4=10 


39. The nonhomogeneous system will have exactly one solution. 


True/False 1.2 
(a) True 

(b) False 

(с) False 

(d) True 

(e) True 

(f) False 

(g) True 

(h) False 

(i) False 
Exercise Set 1.3 
1. 


а. Undefined 
b. 4x2 
c. Undefined 
d. Undefined 
e. 5x5 
{5х2 
в. Undefined 
h. 5х2 
з a[765 
—2 13 
737 
b.|-5 4 -1 
0 —-1 -1 
—1 1 1 
с 15 0 
—5 10 
5 5 
d. | —7 —28 —14 
=21 —7 —35 
е. Undefined 
f. | 22 —6 8 
-2 46 
10 04 
g.|-39 —21 —24 
9 —6 —15 


—33 —12 —30 


п. 


13. 


К. 168 
1. Undefined 


а. | 12 -3 
-4 5 
4 1 
b. Undefined 
с. |42 108 75 
12 —3 21 
36 78 63 
d.| 3 45 9 
11 -11 17 
7 17 13 
e | 3 45 9 
11 -11 17 
7 17 13 
f. {21 17 
17 35 
g[0-211 
12 1 8 
h. | 12 6 9 
48 —20 14 
24 8 16 
i. 61 
j. 35 
k. 28 
. 99 
а. [67 4141] 
p. [63 67 57] 
c. | 41 
21 
67 
d.| 6 
6 
63 
e. [24 56 97] 
f. |76 
98 
97 
а. | =3 3 —2 12 3 —2 7 76 3 —2 7 
48 [=3|6|+6| 5[ |29|—2—2|6|-5| 5|+4|4|, |98|—7|6|--4| 5|+9|4 
24 0 4 56 0 4 9 97 0 4 9 
b. |64 6 4 14 6 —2 4 38 6 —2 4 
21|=6)0|+7|3|; [22| = = 2|0 |+ 11+ 73 18|24|0|--3| 1|+5|3 
77 7 5 28 7 7 5 74 7 7 5 
а. |2 —3 5 |[х1 7 
9 —1 1[х2 |= | –1 
1 5 4), %3 0 
b.|4 0 -3 1[[ х 1 
5 1 0-8|72|] |3 
2-5 9 -1||[*33| |0 
0 3-1 37]|7^44 2 
а. 5x1 + 6x2 = 7x3 2 
=x] — 2x3 + 3x3 = 0 
4x; — x3 = 3 


b. Х| + х2 + хз = 2 


2x1 + 3x2 = 2 
5x; = 3x2 — 6x3 = —9 
15. —1 
17. 4=4, b= —6,с=—1,4=1 
23. [ag 0 0 0 0 0 
0 an 0 0 0 0 
0 0 a3 0 0 0 
0 0 0 ay 0 0 
0 0 0 0 as 0 
0 0 0 0 agg 
b. [аш @12 413 414 915 а6 
0 аз an ад 425 аж 
0 0 азз азд аз аз 
0 0 0 ад ад ад 
0 0 0 0 ass ase 
0 0 0 о 0 ag 
c. 411 0 0 0 0 0 
an an 0 0 0 0 
аз аз аз 0 0 0 
ад ад аз ay 0 0 
451 452 453 аѕд аз; 0 
аб 462 абз абд 465 а66 
d. јарар 0 0 0 0 
ал an аз 0 0 0 
0 азу аз az 0 0 
0 0 ад asy as 0 
0 0 0 аза ass аҳ 
0 0 0 0 ав ав 
x xi +x 
25. г) = ( p a) 


27. 1 1 0 
One; namely, 4= |1 —1 0 
0 0 


True/False 1.3 
(a) True 
(b) False 
(c) False 
(d) False 
(e) True 
(f) False 
(g) False 
(h) True 
(i) True 
(j) True 
(k) True 
(1) False 
(m) True 
(n) True 
(о) False 


Exercise Set 1.4 


17. 


f.|39 13 
26 13 
21. 4.4 [27 0 0 
0 26 —18 
0 18 26 
b. [1 
2 0 0 
0 0026 0.018 
0 —0018 0.026 
с. |4 0 0 
0 —5 —12 
0 12 —5 
а. | 1 0-0 
0-3 3 
0-3 -3 
e. | 16 0 0 
0 —14 —15 
0 15 -14 
£|25 0 0 
0 32 —24 
0 24 32 
27.[_1 0 0 
ап 
0. ois 0 
an 
0 0 H 
ayn 
E 
31. D— CA 18-1 4"gc? p") a 
33. Bo 
35. Ж. Ёш: 
2 2 2 
-1 l 1 1 
Hee 3 Ug 
l 1 1 
2.2.2 
37. l1. 1 
22 2 
Ае i11 1 
a ie 2 
10 0 
38 1 Q5 
m= 95" ш: 
А Е ааб 
Ет 
True/False 1.4 
(а) False 
(b) False 
(c) False 
(d) False 
(e) False 
(f) True 
(g) True 
(h) True 
(i) False 
() True 
(k) False 


Exercise Set 1.5 


1. а. Elementary 


b. Not elementary 
c. Not elementary 


d. Not elementary 


3. 
a- Add3 times row 2 to row 1: Е i 
» -loo 
Multiply row 1 by -7 010 
001 
с 100 
Add 5 times row 1 torow3:|0 1 0 
501 
d. 0010 
0100 
Swap rows | and 3: 1000 
0001 
5 а i 3 —6 —6 —6 
Swaprows Land2:84=| 7 —2 5 =] 
b. 2-1 0 -4 -4 
Add —3 times row 2 torow3:#A=| 1 —3 -1 5 3 
=1 9 4 —12 -10 
с 13 28 
Add 4 times row 3 to row 1: £A—| 2 5 
3 6 
7. а.|0 0 1 
010 
100 
b. {0 0 1 
010 
100 
c. 100 
010 
—2 01 
d.|10 0 
010 
201 
9,|-7 4 
2 =l 
TE [2-2 
7 7 
31 
7 7 
ik[ 3- Э 6 
2 10 5 
-1 1 1 
a 3. @ 
2 10 5 
15. No inverse 
17. Ae Ad 1 
2 2 2 
Zo d d 
2 2 2 
l 1,1 
2 2 2 
19. [| 7 
2 
-1 
0 


п[ і 1 
а 3 3 8 
1 1 _3 
“8 4 2 9 
1 
0 o 4 o 
AL Qd. Qd. Ql 
40 720 10 ~5 
3.|_7 5 5 _1 
12 24 #74 
5 5 1.1 
6 12 4 2 
2. 3. 1 
12 24 8 "4 
mde do cb E 
12 724 ^8 4 
25. а. [| 1 
ооо 
aie 
тоо 
X 
оодо 
E 
ооо 
ь. [1 a 
1-10 o 
отоо 
1 _1 
o o 2-1 
0 0 0 1 
27. c€0,1 
29. [23 1]. [1 oli 1][-4 0][1 0 
22| |02]0 1] 0 111 1 
з. [10 -2] [10 -2][1 0 offi 0 0 
04 з|=|01 olo 13ļ040 
00 1| Joo 1]0o0 1|o0 0 1 
эз.[_1 1 
48 [10]-1op —1]|! © 
13| |-11 sallo 110 4 
4 8 
3.[1 0 2] [1 00], о оох 
o 4 -ż|=[o 4 offo 1 -3]/0 1 0 
00 1[|001 
оо 1| |o 0 1 


37. Add -1 times the first row to the second row. Add -1 times the first row to the third row. Add -1 times the second row to the first row. Add the second row to
the third row.

True/False 1.5 

(a) False 

(b) True 

(c) True 

(d) True 

(e) True 

(f) True 

(g) False 


Exercise Set 1.6 


1.23123, x22 -1 
3, x4 — = l, x3—4, x3— —7 
5.x=1, y=5,z=-1 
7. xj = 2b; —5b5, x2— = b] 43b 
5. ch 22 = 1 
amp RS] 
i. ,, 221 „= 1 
os aa dae т 


п. i a d. 
51-15: 72715 
34 28 
Hu. Р 9 
M qs 2-5 
il —19.... 1% 
51715: 72715 
i 1 3 
iv. =a =2 
X1 5' X2 5 
13. No conditions on b, and 5; 
15. ёз = b — b2 
17. b1 = ba + 24, Ёз = 2b3 + д 
19. 11 12 -3 27 26 
=| -6 -8 1 -18 -17 
-15 -21 9 -38 —35 
True/False 1.6 
(a) True 
(b) True 
(c) True 
(d) True 
(е) True 
(f) True 
(g) True 
Exercise Set 1.7 
л |. T 
2 0 
1 
io 
2.]-1 00 
1 
0 5 0 
0 0 3 
5.|6 3 
4 -1 
4 10 
7. | 215 10 0 20 -20 


35. 


. Not symmetric 
. Symmetric 

. Not symmetric 
. Not symmetric 
. Not invertible 
‚а= – 8 
_x#1, —2,4 


1 0 0 
0 —1 0 
0 0 -1 
a. Yes 
b. No (unless » — 1) 


c. Yes 


a 


. No (unless » = 1) 


1 0 
. 10 -2 k 1 0 
А? = .A?- Ara 
[ sl |: | 1/(-2)* 


п. 


25 0 0 
0 3* 0 
0 0 4* 


43. 


True/False 1.7 
(a) True 

(b) False 

(c) False 

(d) True 

(e) True 

(f) False 

(g) False 

(h) True 

(i) True 

(j) False 

(k) False 

(I) False 

(m) True 
Exercise Set 1.8 

1. 50 


30 60 


10 50 


40 
3. a. $x_3 - x_4 = -500$, $-x_1 + x_4 = 100$, $x_1 - x_2 = 300$, $x_2 - x_3 = 100$
b. $x_1 = -100 + t$, $x_2 = -400 + t$, $x_3 = -500 + t$, $x_4 = t$


c. For all rates to be nonnegative, we need $t = 500$ cars per hour, so $x_1 = 400$, $x_2 = 100$, $x_3 = 0$, $x_4 = 500$


= -24, I2 1l 


* д=ц=Б=6=1А, h=h=0A 


9. $x_1 = 1$, $x_2 = 5$, $x_3 = 3$, and $x_4 = 4$; the balanced equation is $\mathrm{C_3H_8 + 5O_2 \rightarrow 3CO_2 + 4H_2O}$
11. $x_1 = x_2 = x_3 = x_4 = t$; the balanced equation is $\mathrm{CH_3COF + H_2O \rightarrow CH_3COOH + HF}$
13. p(x) = х2 2x +2 
13, 13 

6*7 6" 


a. Using aj = Ё as a parameter, р(х) = 1 + kx + (1 =x? where — со <k < оо. 


15. р(х) 214 


b. The graphs for k = 0, 1, 2, and 3 are shown. 


True/False 1.8 
(a) True 
(b) False 
(c) True 
(d) False 
(e) False 


Exercise Set 1.9 


1. a [0.50 025 
0.25 0.10 


p. | $ 25,290 
$ 22, 581 
а. |0.1 0.6 04 


03 02 0.3 
04 0.1 02 


b. | $31, 500 
$ 26, 500 
$ 26, 300 


5. | 123.08 
202.56 


True/False 1.9 


(a) False 
(b) True 
(c) False 
(d) True 
(e) True 
Chapter 1 Supplementary Exercises 
1. 3x, = x3 + x4 = 1 

2x1 + 3x3 + 3x4 = —1 

3. 3,_ 1 9... d, 5 

х= os vm БЫ х= – 587 vim 2 X3-—8$, X4—t 
3. 2x; — 4х2 + x3 = 6 

—4x| + 3х3 = =1 

хә = х3 = 3 
Qed eee E 


8. fo 3,44, yl 454.3 
к= Ex+ ту, у at + 57 


7. х=4, yes z=3 
9. a a#0, b#2 


b. @#0, b=2 
с. a—0, b=2 
d. @=0, b#2 
1i. [0-2 
el 
13. а -1 3 -1 
х-| 6 0 1 
b. x» |l -2 
к= 71] 
с 213 _160 
37 37 
а е 
37 37 


15. @=1, b= ~ 2, с=3 


Exercise Set 2.1 
1. М = 29, Си = 29 
M15—21, Ср= ~ 21 
M13 = 27, Суз = 27 
Му = ~ 11, Су = 11 
Мэ = 13, Сэ = 13 
Мз= – 5, Сз = 5 
Ma, = = 19, Сз = – 19 
Маз = = 19, Сз = 19 
Мзз = 19, Сзз = 19 
3. а. Міз=0, Сіз=0 
b. Мәз = — 96, C33 = 96 
c, Мә = – 48, Сә = – 48 


22: 


59; 


E MS 
| 59 59 
9. а2— 5a 4-21 

11. —65 

13. —123 

15. A2 lor ~ 3 

17. A2 lor -1 

19. (all parts) — 123 


33. The determinant is $\sin^2\theta + \cos^2\theta = 1$.
35. d2=đd1 +A 


True/False 2.1 
(а) False 

(b) False 

(c) True 

(d) True 

(e) True 

(f) False 

(g) False 

(h) False 

(i) True 
Exercise Set 2.2 
5. —5 

5-1 

9. 1 

11.5 

13. 33 

15. 6 

17. 72 


19. Exercises 14: 39; Exercise 15: 6; Exercise 16: -Б Exercise 17: —2 


21. —6 
23. 72 


True/False 2.2 
(a) True 

(b) True 

(c) False 

(d) False 

(e) True 

(f) True 
Exercise Set 2.3 
7. Invertible 

9. Invertible 


п. 
13. 
15. 


17. 
19. 


21. 


23. 


25. 


27. 


29. 
31. 
35. 


E TA 


Not invertible 
Invertible 
5-3: y 17 
ks 2 
ke-l 
3 —5 —5 
At=|-3 4 5 
2 -2 -3 
1 3 
2 2 1 
zi. 3 
A -|0 1 5 
1 
0 0 2 
-4 3 0-1 
21 2-1 0 0 
a= -7 0-1 8 
6 0 1—7 
3 2 1 
ТМГ qn 
30 38 40 
Ape apes ape ma 
Cramer's rule does not apply. 
y=0 
а. —189 
b. _1 
7 
c. _8 
F 
а. LL 
56 
È 7 


True/False 2.3 
(a) False 
(b) False 


(с) 


True 


(d) False 


(e) 
(f) 


True 
True 


(g) True 
(h) True 


0) 
(i) 


True 
True 


(k) True 


а) 


False 


Chapter 2 Supplementary Exercises 


—18 


24 
—10 


329 


. Exercise 3: 24; Exercise 4: 0; Exercise 5: — 10; Exercise 6: —48 

. The matrices in Exercise 1—3 are invertible, the matrix in Exercise 4 is not. 
. =b? 5b —21 

. —120 


17. 


19. 


l 
— N 
s» xl 


23. 


31 72 102 15 
329 329 329 329 


кә 
E 
ÉD op——————ÓM————qeo———————M BM 
1 
| n щл Uj 
I 
loa alw ano 
l — 
а 
|» vn [ro z|- N 


5, — 3.143, ao ea 
St Sy y 55 *$ 
29. 2 2 2 2 2 2 
(b) с-а Б а +Ь2-с 
ей = 2ас ТЕ 2ab 


Exercise Set 3.1 


1. à 


п. 


13. 


15. 


17. 


| PiP2= (721,3) 
oo 
. PjP2 = (—3,6, 1) 


a. The terminal point is B(2, 3). 


. The initial point is 4(—2, —2, — 1). 


а. u— (—1,2, — 4) is one possible answer. 


е о 


е 


.u—(7, = 2, — 6) is one possible answer. 


оа (1, —4) 


v —3u— (12, 8) 
2(u — Sw) = (38, 28) 


‚ Зу — 2(u + 2м) = (4, 29) 


—3(w—2u--v) = (33, —12) 
(—2u — v) – 5(v + 3w) = (37, 17) 


.(-1,9, = 11, 1) 


(22, 53, — 19, 14) 
(—15,13, —56, —2) 


.(—90, —114,60, —36) 


(-9, -5, -5, =3) 


. (27,29, 227, 9) 


a w-u-(-9,3, —3, —8,5) 
b. 2v -- 3u= (13, — 5, 14, 13, — 9) 


> ос R^ 


19. 


Р 


‚ —w+ 3(v ш) = (—14, —2,24,2,7) 


v-w=(-2,1, —4, —2,7) 


b. би+ 2v = (—10,6, — 4, 26, 28) 


21. y— 


‚ (2u — 7w) — (8v +u) = (—77, 8, 94, — 25, 23) 


81821 
3° O° 3° 3° 6 


23. a. Not parallel 
b. Parallel 


c. Parallel 


25. а= 3, b= —1 


27.61 = 
29. 21 = 
A 


a. 


2, c2= = 1, с3= 5 
1, c2=1, сз= = 1, c4=1 


(3 =l -1) 
2 X 2 


b. [23 |. 9 1 
4' 44 
True/False 3.1 
(а) False 


(b) False 
(c) False 


(d) True 
(e) True 


(f) False 
(g) False 


(h) True 


(i) False 


() True 


(k) False 


Exercise Set 3.2 


1, 


b. 


m 


11. 


= 


e 


a. 


ME. 


= 


vl = y83 

| + lvli = 17 + y26 
—2u + 2v|| = 2/3 
—3u — 5v + w|| = {466 


=] 


a. ||3u — 5v || = V 2570 


{За — 511 + | = 346 — 10/21- [42 


I| — 19111 = 2y 966 


k= —2 


EP 
T di 


u:v— —8, u:u—26, v: v —24 


u:v—0, u:u—54, v: v—21 


© |u- vl y14 
а= vll = 59 
а= vll = 677 


16) 


) 


5(—у + 4а — v) = (125, —25, — 20,75, —70) 
‚ -2(8w -- v) + (2а +w) = (32, — 10, 1,27, = 
т a a э 

1н 5у +2) +v [S 2, -12, -3, 


—5 v _f4 _3\) _ v _f_4 3 
Ilvll =>, im 2) ivi (-$. 3 


1 


=: 


Vis 


(1, 0, 2, 1, 3) 


13. 


15 
a cos f = ——— —: 0) is acute 
mu 
b. cos — A ois obtu 
yeas ; 0 is obtuse 
136 
© cog f= — === —— . 9 is obtuse 
{225ү 180 ' 
15. 
е acb— "TE 
2 
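17. a. $\mathbf{u} \cdot (\mathbf{v} \cdot \mathbf{w})$ does not make sense because $\mathbf{v} \cdot \mathbf{w}$ is a scalar.
b. $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w})$ makes sense.
c. $\|\mathbf{u} \cdot \mathbf{v}\|$ does not make sense because the quantity inside the norm is a scalar.
d. $(\mathbf{u} \cdot \mathbf{v}) - \|\mathbf{u}\|$ makes sense since the terms are both scalars.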
19. a4af[4.3 
> 5 
Ll Pe NN 28 
542' 502 
(3:1 93 
47 2° 4 
И ж 5 
V55 ү55 ү55 ү55° ү55 
EE Cee 
962 
b. cos — — _3_ 
y10 
c. cos 0 — 0 
d. cos 0 — 0 
25. a, ау] 10, |fullllvl] = y 13/ 17 = 14.866 


b. 


c. 


ju |7, ПЫ = у 10014 == 11.832 
lu -v|—5, |hulliivll = (3) 02) = 6 


27. A sphere of radius 1 centered at $(x_0, y_0, z_0)$.
True/False 3.2 


(2) True 
(b) True 


(c) False 


(d) True 
(e) True 


(f) False 
(g) False 
(h) False 


(i) True 

(j) True 

Exercis 
1. 


e Set 3.3 
Orthogonal 


a. 
b. Not orthogonal 


с. 


a 


Not orthogonal 


. Not orthogonal 


a. Not an orthogonal set 
b. 

ё 
а. 


Orthogonal set 
Orthogonal set 


Not an orthogonal set 


"gr n) 


7. Yes 


39. 
41. 


2 =2(x+ 1) + Q -= 3)— (2+2) = 0 
. 2=0 

. Not parallel 

. Parallel 

. Not perpendicular 


а. 2 


TEGERE OE -0) 
Кот 1$; 113°" 13 


0 (The planes coincide.) 


(b) cos [E cos у=-©— 
м" 77577 Ту 


True/False 3.3 
(а) True 
(b) True 
(c) True 
(d) True 
(e) True 


(f) 


False 


(g) False 


Exercise Set 3.4 


1. 


11. 


13. 


15. 


17. 
19. 


Vector equation: $(x, y) = (-4, 1) + t(0, -8)$; parametric equations: $x = -4$, $y = 1 - 8t$

Vector equation: $(x, y, z) = t(-3, 0, 1)$; parametric equations: $x = -3t$, $y = 0$, $z = t$

Point: $(3, -6)$; parallel vector: $(-5, -1)$

Point: $(4, 6)$; parallel vector: $(-6, -6)$

Vector equation: $(x, y, z) = (-3, 1, 0) + t_1(0, -3, 6) + t_2(-5, 1, 2)$; parametric equations: $x = -3 - 5t_2$, $y = 1 - 3t_1 + t_2$, $z = 6t_1 + 2t_2$

Vector equation: $(x, y, z) = (-1, 1, 4) + t_1(6, -1, 0) + t_2(-1, 3, 1)$; parametric equations: $x = -1 + 6t_1 - t_2$, $y = 1 - t_1 + 3t_2$, $z = 4 + t_2$

A possible answer is vector equation: $(x, y) = t(3, 2)$; parametric equations: $x = 3t$, $y = 2t$

A possible answer is vector equation: $(x, y, z) = t_1(0, 1, 0) + t_2(5, 0, 4)$; parametric equations: $x = 5t_2$, $y = t_1$, $z = 4t_2$
x1 = —Ss—í, X2—8s, х= 
3 19 8 2 


1 3 
= - = t 
m= 57-45 3^ x3 Tt 75+ st, хз=г, X478, X5 


21. a. $(1, 0, 0) + s(-1, 1, 0) + t(-1, 0, 1)$
b. a plane in $R^3$ passing through P(1, 0, 0) and parallel to $(-1, 1, 0)$ and $(-1, 0, 1)$


23. as Ж + у + 272 
—2х + 3y = 
b. a line through the origin in g3 
3 2 
C x——-Z2p y= - ёр z-—t 
x 5% y 5^ 2 
oe og m= -s+ ds, X225, X3 =Í 
€ хр= 1 25+ 1, X228, хз=1+ 
27. х= 1 - $s- h, x2=8, x3=£, x4= 1; The general solution of the associated homogeneous system is хү = — fs- 3 


particular solution 


True/False 3.4 
(a) True 
(b) False 
(c) True 
(d) True 
(e) False 
(f) True 


Exercise Set 3.5 


1, X3—8, X3—£, X4—0. A 


of the given system is х 


1. a, (32, =6, =4) 
b. (= 14, = 20, — 82) 


c. (27,40, = 
3. (18, 36, — 18) 
5. (73,9, – 3) 
7. 459 
9. y 101 


42) 


19. The vectors do not lie in the same plane. 


21. —92 
23. abc 


27. ái 


29. 2(v xu) 


True/False 3.5 
(a) True 
(b) True 
(c) False 
(d) True 
(e) False 
(f) False 


Chapter 3 Supplementary Exercises 


1. 


19. 


21. 
23. 
25. 
29. 


a. 2v = 2u = (13, — 3, 10) 
b. ||u + v +w] = 70 
c. y774 


* projyu = -#(2 —5, -5) 


е 


е. ur (vx w) = — 122 

f. (C—5v-4-w) x ((u* v)jw) = (— 3150, — 2430, 1170) 
a. jv — 2u = (—5, — 12, 20, —2) 

b. ||u -- | = y 106 

c. y 2810 


d. projyu = -i. 1, —6, —6) 
. Not an orthogonal set 


а. A line through the origin, perpendicular to the given vector. 
b. A plane through the origin, perpendicular to the given vector. 
c. {0} (the origin) 


d. A line through the origin, perpendicular to the plane containing the two noncollinear vectors.
. True 
‚5(—1,—1,5) 
. | 14 
17 
pl. 


[35 


- Vector equation: (x, у,2) = (—2,1,3)+4,(1, —2, —2) £20, —1, —5y 


parametric equations: x = —2 464 + 53, y —1—214 =), z=3 — 240—542 
Vector equation: (x, y) = (0, — 3) --£(8, — 1); 


parametric equations: х = 8g, y= —3—1 

A possible answer is vector equation: (x, y) = (0, — 5) + £(1, 3); parametric equations: x =£, y = — 5 + 3f 
3(x +1) + 6(у —5) + 2(2-—6) =0 

-18(x —9) – 51у —24(z—4) =0 

A plane 


Exercise Set 4.1 


1. 


п. 


Тг 


(a) 9+ У = (2, 6), 3u= (0, 6) 
(c) Axioms 1—5 


3. The set is a vector space with the given operations. 
5. Not a vector space, Axioms 5 and 6 fail. 

Te 
9 


. The set is a vector space with the given operations. 


Not a vector space. Axiom 8 fails. 


The set is a vector space with the given operations. 


ue/False 4.1 


(а) False 
(b) False 
(c) True 
(d) False 
(e) False 


Exercise Set 4.2 


1 
3 
5 
7 
9 


(а), (е), (е) 
(а), (b), (d) 
- (a), (с), (d) 
- (а), (b), (d) 
- (а), (b), (с) 


п. . The vectors span 


a 
b. The vectors do not span 


c. The vectors do not span 


Qa 


. The vectors span 


13. The polynomials do not span 


15. a. Line; x = = 1, y 


z=t 


5 t 
b. Line; x 22, y t, z=0 

. Origin 

Origin 


. Line; x = — 3t, y= —2t, z—t 


> о 2 со 


М Plane; x — 3y bz=0 


True/False 4.2 
(a) True 
(b) True 
(c) False 
(d) False 
(е) False 
(f) True 
(g) True 
(h) False 
(i) False 
(j) True 
(k) False 


Exercise Set 4.3 


1. a. $\mathbf{u}_2$ is a scalar multiple of $\mathbf{u}_1$.
b. The vectors are linearly dependent by Theorem 4.3.3.
c. $\mathbf{p}_2$ is a scalar multiple of $\mathbf{p}_1$.
d. B is a scalar multiple of A.

3. None 

5. a. They do not lie in a plane. 


b. They do lie in a plane. 
Т. 2 3 7 3 7 2 
(b) = Éy = ұз wot aye уз= — c Sy 
vi 7.3327 ууз Уз= 9L + 973, УЗ gut gu 


9, A— —l A- 
A z А=1 


a. They are linearly independent since $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ do not lie in the same plane when they are placed with their initial points at the origin.
b. They are not linearly independent since $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ lie in the same plane when they are placed with their initial points at the origin.
21. W(x) = —x sin x —cos x #0 for some x. 
23. a, W(x\=e" #0 

p. W(x) 2220 
25. W(x) —2 sin x #0 for some x. 
True/False 4.3 
(а) False 
(b) True 
(c) False 
(d) True 
(e) True 
(f) False 
(g) True 
(h) False 


Exercise Set 4.4 


a. A basis for $R^2$ has two linearly independent vectors.
b. A basis for $R^3$ has three linearly independent vectors.


c. A basis for $P_2$ has three linearly independent vectors.
d. A basis for $M_{22}$ has four linearly independent vectors.


3. (a), (b) 
7 a GH) s= (3, – 7) 


b. (w)_S = (5/28, 3/14)
c. (w)_S = (a, (b - a)/2)
9. a. (v)_S = (3, -2, 1)
b. (v)_S = (-2, 0, 1)
11. (A)_S = (-1, 1, -1, 3)
13. A = A1 - A2 + A3 - A4
15. p = 7p1 - 8p2 + 3p3
17. а. (2,0) 


с ы 


True/False 4.4 
(a) False
(b) False 
(c) True 
(d) True 
(e) False 
Exercise Set 4.5 
1. Basis: (1, 0, 1); dimension = 1 
3. Basis: (4, 1, 0, 0), (-3, 0, 1, 0), (1, 0, 0, 1); dimension = 3
5. No basis; dimension = 0 
1 
TE 
b. (1, 1, 0), (0, 0, 1) 
с. (2, 21,4) 
- (1,1, 0), (0, 1, 1) 


a 


a.” 

b. (и + 1) 
2 

с. (и +1) 
2 


13. Any two of (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1) can be used. 
15. v3 = (a, b, c) with 9a - 3b - 5c ≠ 0
True/False 4.5 

(a) True 

(b) True 

(c) False 

(d) True 

(e) True 

(f) True 

(g) True

(h) True 

(i) True 

(j) False 


Exercise Set 4.6 


п. 


[w]ls- К 


[w] s= 


с. а 
шыр” | 
2 


4 
(р) 5= (4, – 3,1), me^ 
1 


xj» Sl 


0 
(p)s—(0,2, —1), e| 1 
-1 


‚ w= (16, 10, 12) 
q=3 +4х? 


sa Tii 
e 73] 


ор 


e 


15. (а) 3 5 
-1 -2 
[2 5 
-1 -3 
O tis -[ 1] s - [71] 
© tels, - | 1| ten |. 1] 
17 а 5 
3 2 5 
1 
-2 -3 -j 
5 1 6 
b. _1 
9 2 
[#]в,=|—9|, [w]g; 7| 23 
—5 2 
6 
19. a. [cos 2θ  sin 2θ; sin 2θ  -cos 2θ]
23. a. B = {(1, 1, 0), (1, 0, 2), (0, 2, 1)}
b.pg-.](4 1 _2\ (1 _1 2 
a= (2, 5” 5) |5. P5) 
True/False 4.6 
(a) True 
(b) True 
(c) True 
(d) True 
(e) False 
(f) False 


Exercise Set 4.7 
1. r1 = (2, -1, 0, 1), r2 = (3, 5, 7, -1), r3 = (1, 4, 2, 7);


3. 


“Сан 


not іп the column space of А. 


Э 


е 


» 


F 


n 


| 
{ЕҢ 
EL 

| 


<Б 


d. |6 it 1 2 1 
5 5 5 5 5 
T 4 E 4 BE 
5 +s 5 +f 5 5 5 + 5 
0 1 0 1 0 
0 0 1 0 1 
E ou 1 Н 
a. r1 = [1 0 2], r2 = [0 0 1]; c1 = [1; 0; 0], c2 = [2; 1; 0]
b. r1 = [1 -3 0 0], r2 = [0 1 0 0]; c1 = [1; 0; 0; 0], c2 = [-3; 1; 0; 0]
c. r1 = [1 2 4 5], r2 = [0 1 -3 0], r3 = [0 0 1 -3], r4 = [0 0 0 1]; c1 = [1; 0; 0; 0; 0], c2 = [2; 1; 0; 0; 0], c3 = [4; -3; 1; 0; 0], c4 = [5; 0; -3; 1; 0]
d. r1 = [1 2 -1 5], r2 = [0 1 4 3], r3 = [0 0 1 -7], r4 = [0 0 0 1]; c1 = [1; 0; 0; 0], c2 = [2; 1; 0; 0], c3 = [-1; 4; 1; 0], c4 = [5; 3; -7; 1]
We зт =4 =3), (0,1, —5, —2), (0 0,1, -3) 
b. (1, —1,2,0), (0, 1,0, 0), (б. 0,1, -}) 
c. (1,1,0, 0), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1) 


17. a. [3a -5a; 3b -5b] for all real numbers a, b not both 0.

b. Since A and B are invertible, their null spaces are the origin. The null space of C is the line 3x + y = 0. The null space of D is the entire xy-plane.


True/False 4.7 
(a) True 
(b) False 
(c) False 


(d) False 
(e) False 
(f) True 
(g) True 
(h) False 
(i) True 
(j) False 


Exercise Set 4.8 

1. Rank(A) = Rank(A^T) = 2

3. a. 2; 1
b. 5; 2
c. 2; 2
d. 2; 3
e. 3; 2

5. a. Rank = 4, nullity = 0
b. Rank = 3, nullity = 2
c. Rank = 2, nullity = 0
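The rank and nullity answers in this set are tied together by the dimension theorem, rank(A) + nullity(A) = n for a matrix with n columns. The sketch below illustrates that relationship with exact rational arithmetic; the function name `rank` and the 3 x 4 matrix are hypothetical illustrations, not data from the exercises.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via forward Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return pivot_row

# Hypothetical example: the third row is the sum of the first two.
A = [[1, 2, 0, 1],
     [2, 4, 1, 3],
     [3, 6, 1, 4]]
r = rank(A)
nullity = len(A[0]) - r   # dimension theorem: rank + nullity = number of columns
```

Applying the same function to the transpose gives the same rank, which is the content of the Rank(A) = Rank(A^T) answers above.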


7. a. Yes, 0
b. No
c. Yes, 2
d. Yes, 7
e. No
f. Yes, 4
g. Yes, 0


9. b1 = r, b2 = s, b3 = 4s - 3r, b4 = 2r - s, b5 = 8s - 7r
11. No
13. Rank is 2 if r = 2 and s = 1; the rank is never 1.
17. a. 3
b. 5
c. 3
d. 3


19. 0 1 1.2 
^o o al 4 
True/False 4.8 
(a) False 
(b) True 
(c) False 
(d) False 
(e) True 
(f) False 
(g) False 
(h) False 
(i) True 
(j) False 


Exercise Set 4.9 

1. a. Domain: R^2; codomain: R^3
b. Domain: R^3; codomain: R^2
c. Domain: R^3; codomain: R^3


d. Domain: R^6; codomain: R^1
3. R^2, R^3, (1, 2, 3)


5. a. Linear; R? _, R? 
b. Nonlinear; R? = EE 


c. Linear; R? _, R3 
d. Nonlinear; R4 —, д2 
7. (a) and (c) are matrix transformations; (b), (d), and (e) are not matrix transformations. 


9. [3 5 -1; 4 -1 1; 3 2 -1]; T(-1, 2, 4) = (3, -2, -3)
11. a. [0 1; -1 0; 1 3; 1 -1]
b. [7 2 -1 1; 0 1 1 0; -1 0 0 0]
c. [0 0 0; 0 0 0; 0 0 0; 0 0 0; 0 0 0]
d. [0 0 0 1; 1 0 0 0; 0 0 1 0; 0 1 0 0; 1 0 -1 0]
13. a. T(-1, 4) = (5, 4)
b. T(2, 1, -3) = (0, -2, 0)
15. a. (2, -5, -3)
b. (2, 5, 3)
c. (-2, -5, 3)


17. a. (-2, 1, 0)
b. (-2, 0, 3)
c. (0, 1, 3)


b. (0, 1, 292) 


с. (—1,—2,2) 
Е с 2 
(7242. 1,0) 


(1, 2, 2) 


d 


19. 


Em 


21. 


Em 


= 


ы 


25. 


wle sjo wl 


I 
wle w]e lo 


wl wl le 


29. a. Twice the orthogonal projection on the x-axis.
b. Twice the reflection about the x-axis.
31. Rotation through the angle 2θ.
33. Rotation through the angle θ and translation by x0; not a matrix transformation since x0 is nonzero.
35. A line in R^n.


True/False 4.9 
(a) False 
(b) False 
(c) False 
(d) True 


(e) False 


(f) True 


(g) False 
(h) False 


(i) True 
Exercise Set 4.10 


1 


11. 


13. 


15. 


a. 


=й 


s 


5 —1 21 -8 -3 1 


TgoT4—|10 -8 4| ТдоТв= | –5 —15 —8 


45 3 25 44 —11 45 


1 1 30 
neli anl i] 
3 3 5 4 
пот | E nen-[ 4 
T2(T1(x1, x2)) = (3x1 + 3x2, 6x1 - 2x2),
T1(T2(x1, x2)) = (5x1 + 4x2, x1 - 4x2)


1 0 
0 —1 


a. Tq o То = T30 T] 
b. Tio T3— T30 Ti 


e 


йй ыо @ с C E 


A TioT52T30T| 


. Not one-to-one 
. One-to-one 
. One-to-one 
. One-to-one 
. One-to-one 


. One-to-one 


. One-to-one 
i 2 
One-to-one; i i Tw, жз) = (in - 
3 3 
. Not one-to-one 
^ 0 —1 RE 
One-to-one; i d ОТ (иу, м2) = (= м2, 


. Not one-to-one 


а. Reflection about the x-axis 


b. Rotation through the angle — 4 


т 


* Contraction by a factor of i 


. Reflection about the yz-plane 
. Dilation by a factor of 5 


ond 
3"? 3 


—w) 


|+ 


1 


3 


4 


17. 


с ры 


e 


19. 


. Matrix operator 
. Not a matrix operator 
. Matrix operator 


. Not a matrix operator 


а. Matrix transformation 


b. Matrix transformation 


21. 


233. 


H3 
ІЕЕ 
[| 


a. T_A(e1) = (-1, 2, 4), T_A(e2) = (3, 1, 5), T_A(e3) = (0, 2, -3)
b. T_A(e1 + e2 + e3) = (2, 5, 6)
c. T_A(7e3) = (0, 14, -21)

25. a. Yes
b. Yes


27. 


29. 


a. 


b. 


(b) T(xi, хә) = (+ ха, x172) 


The range of T is a proper subset of R^n.


T must map infinitely many vectors to 0. 


True/False 4.10 


(a) False 
(b) True 
(c) True 
(d) False 
(e) False 
(f) False 


Exercise Set 4.11 


7. Rectangle with vertices at (0, 0), (-3, 0), (0, 1), (-3, 1)


Б 


F 


ni 
0 


11. a. Expansion by a factor of 3 in the x-direction
b. Expansion by a factor of 5 in the y-direction and reflection about the x-axis 
c. Shearing by a factor of 4 in the x-direction 
13. a. [1/2 0; 0 5]
b. [1 0; 2 5]
c. [0 -1; -1 0]
17. "E 2, 
b. Y= 
Сз == 
dnt 
d. y= —2x 
e. 8+ 
— 34543. 
11 
19. a. [1 0 k; 0 1 k; 0 0 1]
b. Shear in the xz-direction with factor k maps (x, y, z) to (x + ky, y, z + ky): [1 k 0; 0 1 0; 0 k 1]

Shear in the yz-direction with factor k maps (x, y, z) to (x, y + kx, z + kx): [1 0 0; k 1 0; k 0 1]


True/False 4.11 


(a) False
(b) True 
(c) True 
(d) True 
(e) False 
(f) False 
(g) True 


Exercise Set 4.12 


1. a. Stochastic
b. Not stochastic
c. Stochastic
d. Not stochastic


0.45455 


3. pee 


5. a. Regular
b. Not regular
c. Regular


Am 
ore © 


9.|4 
11 
E 
11 
23. 
11 
11. a. Probability that something in state 1 stays in state 1
b. Probability that something in state 2 moves to state 1 
c. 0.8 
d. 0.85 
13. a. [0.95 0.55; 0.05 0.45]
b. 0.93 
c. 0.142 
d. 0.63 
15. a.
Year      1        2        3        4        5
City      95,750   91,840   88,243   84,933   81,889
Suburbs   29,250   33,160   36,757   40,067   43,111
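The table above is the kind of output produced by repeatedly applying a transition matrix to a state vector, x(k+1) = P x(k). The sketch below reproduces the tabulated figures under two assumptions that are inferred from the numbers themselves rather than quoted from the exercise: a transition matrix with columns (0.95, 0.05) and (0.03, 0.97), and initial populations of 100,000 (city) and 25,000 (suburbs). The helper name `step` is hypothetical.

```python
from fractions import Fraction

# Assumed transition matrix (each column sums to 1) and assumed initial state.
P = [[Fraction(95, 100), Fraction(3, 100)],
     [Fraction(5, 100), Fraction(97, 100)]]
state = [Fraction(100000), Fraction(25000)]   # [city, suburbs] before year 1

def step(P, x):
    """One Markov step: return the matrix-vector product P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

table = []
for year in range(1, 6):
    state = step(P, state)
    # Round to whole persons only for display; the state itself stays exact.
    table.append((year, round(state[0]), round(state[1])))
```

Keeping the state as exact fractions and rounding only when recording a row avoids accumulating rounding error from year to year.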
b. 
17: а, 23 
100 
b. | 46 
159 
22 
53 
a 
159 
с. 35, 50, 35 
19. zo A l 1 
10 10 5 3 
shd An ge 
P=| 5 10 2 |з 
13 3 1. 
10 5 10 3 


21. P^k q = q for every positive integer k


True/False 4.12 
(a) True 
(b) True 
(c) True 
(d) False 
(e) True 
Chapter 4 Supplementary Exercises 

1. (a) u + v = (4, 3, 2), -u = (-3, 0, 0)

(c) Axioms 1-5
3. If s ≠ 1, -2, the solution space is the origin. If s = 1, the solution space is a plane through the origin. If s = -2, the solution space is a line through the origin.


7. A must be invertible 


9. a. Rank = 2, nullity = 1
b. Rank = 2, nullity = 2
c. Rank = 2, nullity = n - 2

n а Ji, x^. Y sos a where 254 = » if n is even and 2, = » — 1 ifn is odd. 
b. 


fx, х2, xi, Е xnl 


Оно ooo 


— о н бо 


15. Possible ranks are 2, 1, and 0.
Exercise Set 5.1 
1. 5 


3. a. λ^2 - 2λ - 3 = 0
b. λ^2 - 8λ + 16 = 0
c. λ^2 - 12 = 0
d. λ^2 + 3 = 0
e. λ^2 = 0
f. λ^2 - 2λ + 1 = 0


cose 


Basis for eigenspace corresponding to λ = 3:


Basis for eigenspace corresponding to λ = 4:


Basis for eigenspace corresponding to λ = √12:


d. There are no eigenspaces.


Basis for eigenspace corresponding to λ = 0:
Basis for eigenspace corresponding to λ = 1:


7. a. 1, 2, 3
b. -√2, 0, √2
c. -8
d. 2
e. 2
f. -4, 3

9. a. λ^4 + λ^3 - 3λ^2 - λ + 2 = 0
b. λ^4 - 8λ^3 + 19λ^2 - 24λ + 48 = 0


П. а 2] [о -1 
A= 1:basis Я " я ; A= —2:basis : 
0 1 0 
b 3 
2 
A-4:basis| 1 
0 
0 
3.) (1. ж 
1, (2) = 51, 2 =512 
15. a. y = x and y = 2x
b. No lines
c. y = 0


True/False 5.1 
(a) False 
(b) False 
(c) True 


EB 
рү 


— Ml = |н 


; A= — l:basis 


0 000 000 
0,[001,/|000 
0 010 001 
| ; : 0 
; basis for eigenspace corresponding to À — — 1: 1 
NECS NR 
12 |; basis for eigenspace corresponding to ^ — — {12 : {12 
1 1 


—2 
1 
1 
0 


(d) False 
(e) True 
(f) False 
(g) False 


Exercise Set 5.2 

1. Possible reason: Determinants are different. 
3. Possible reason: Ranks are different. 

5. λ = 0: 1 or 2; λ = 1: 1; λ = 2: 1, 2, or 3
7. Not diagonalizable 

9. Not diagonalizable 
11. Not diagonalizable 


13. [1 
ede a 
[11 E 
15. P = [-2 0 1; 0 1 0; 1 0 0]; P^-1 A P = [3 0 0; 0 3 0; 0 0 2]
17. P = [1 2 1; 1 3 3; 1 3 4]; P^-1 A P = [1 0 0; 0 2 0; 0 0 3]
19. P = [1 0 0; 0 1 0; -3 0 1]; P^-1 A P = [0 0 0; 0 0 0; 0 0 1]
21. [100 0 =2 0010 
Eti sil 54,5, | 0-200 
P-loo1 of ? “=| oso 
[000 1 0 003 
23. [-1 10237 -2047; 0 1 0; 0 10245 -2048]
25. 1 1 1 
1 1 mo olé 3 6 
A"=Pp"P1=|2 o —1||0 3" 0 2 0 =: 
1 -1 1lo o 4” * d. 4 
373 3 
л л Ка ша ола in Exercise 20 of Section 5.1 
n possi шу 15 = а— а= А where 1 anı 2 are as 1n Exercise о ection 5.1. 


33. a. λ = 1: dimension = 1; λ = 3: dimension ≤ 2; λ = 4: dimension ≤ 3
b. Dimensions will be exactly 1, 2, and 3.
c. λ = 4


True/False 5.2 

(a) True

(b) True 

(c) True 

(d) False 

(e) True 

(f) True 

(g) True 

(h) True 

Exercise Set 5.3 

1. ū = (2 + i, -4i, 1 - i), Re(u) = (2, 0, 1), Im(u) = (-1, 4, 1), ||u|| = √23
5. x = (7 - 6i, -4 - 8i, 6 - 12i)

7. a-L^. M Re (=|) Hi in (4) =| 7! | det(A) =17—1, (4) =1 
jj, urv= = 1 +i, u:w—18—7i v-w= 124 6i 

13. -11 - 14i


а. k= = 8; 
b. None 


True/False 5.3 
(a) False 
(b) True 
(c) False 
(d) True 
(e) False 
(f) False 


Exercise Set 5.4 
1. a. y1 = c1 e^(5x) - 2c2 e^(-x)
y2 = c1 e^(5x) + c2 e^(-x)


а. у= — + cae” 


y2=cje" + 2сде° —сзе% 
уз= 2сзе?* – cae” 
b 2 2 3x 
- y=" —2e 
y22e- 22% + 227" 
уз= = 202% 4 2е?^ 


7. y 2 сце?" ege ?* 


9. y — cie" +e” + cae?" 

True/False 5.4 

(a) False 

(b) False 

(c) True 

(d) True 

(e) False 

Chapter 5 Supplementary Exercises 


1. (b) The transformation rotates vectors through the angle θ; therefore, if 0 < θ < π, then no nonzero vector is transformed into a vector in the same or opposite direction.
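The geometric argument above can be cross-checked algebraically. The following short derivation assumes that the transformation is multiplication by the standard rotation matrix A with rows (cos θ, -sin θ) and (sin θ, cos θ); that matrix is an assumption here, since the exercise statement is not reproduced in this answer key.

```latex
\[
\det(\lambda I - A)
  = \det\begin{bmatrix} \lambda - \cos\theta & \sin\theta \\ -\sin\theta & \lambda - \cos\theta \end{bmatrix}
  = (\lambda - \cos\theta)^2 + \sin^2\theta
  = \lambda^2 - 2\lambda\cos\theta + 1 .
\]
\[
\text{Discriminant: } 4\cos^2\theta - 4 = -4\sin^2\theta < 0
\quad \text{for } 0 < \theta < \pi ,
\]
```

Under that assumption the characteristic equation has no real roots, which agrees with the statement that no nonzero vector is mapped to a real scalar multiple of itself.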
3 (о [110 
021 
005 
9. A^2 = [15 30; 5 10], A^3 = [75 150; 25 50], A^4 = [375 750; 125 250], A^5 = [1875 3750; 625 1250]
п. 0, (А) 


13. They are all 0. 
15. 1 0 0 


11 
-1 -3 -2 
1-1 _1 


17. They are all 0, 1, or —1. 


Exercise Set 6.1 
1. 


Wo 
p. —6 
с. —3 
d. 13 
e. y5 
t. узэ 
3. а. 2 
b. 11 
& =13 
d. -8 
е. 0 
5. a 5 
Ь. 1 
с. 77 
а. 1 
е. 1 
fd 
7. 3 
. 56 
9. (b) 29 
HV. Г о 
0 5 
b.|2 0 
0 y6 
13. à у 
b. 0 
15. а 105 
b. [47 
17. (p. чу= 50, 1011= 6/3 
19 а. 302 
b. 3/5 
с. 3y 13 
Zh o a 
b. 


zh к= | JA V} = —2 < 0, so Axiom 4 fails. 


True/False 6.1 
(a) True 
(b) False 
(c) True 
(d) True 
(e) False 
(f) True 
(g) False 


Exercise Set 6.2 
1. 1 


a. —— 


f 
b. ~ 3. 
[з 
‚0 
d. _—20_ 
910 
e. 1. 
үз 
f. 2 
|55 
3 4.19 
107 
b. 0 
7. No 
9. a. k = -3
b. k = -2, -3
13. No
15. a. x = t, y = -2t, z = -3t
b. 2x - 5y + 4z = 0
c. x - z = 0
31. 


a. The line y = -x
b. The xz-plane 


c. The x-axis 


True/False 6.2 
(a) False 

(b) True 

(c) True 

(d) True 

(e) False 

(f) False 
Exercise Set 6.3 
1. (a), (b), (d) 
3. (b), (9) 

5. (a) 


9 


5 


уз + 4v3 


= 


5 


5 


v3 + буз + 


11 
10 


inn 


(b) u= — 


11. 


e 
2 
mjn 0 
| [= 
s =$ 
omn 
1 + 
N 
в 5 


14 

3 
ENS 

m 


а. w= 
bw 


13. 


DER 
ERE 
= € 


1 ! 
1 E 
күб с 
-—|e esr 
1 l 


А А 


e сч 
Ed E 
е 
ая 
l | 
vt omisi 
| І 
үч war 


b. v1 = (1, 0), v2 = (0, -1)


2t J 
-~ 
15 
| -| — 
= ШЕЕ 
a -[& e|. 
E А 1 
© А 
| aS -|£ 
ro Il —Á 
Et сз ll 
> ex 
I = * 
1 а= 
-2 1 
> ах 
о i 
ed 4K os 


o 1 |02 3у2 
di 

1 1 

49 үз 

— 
234 

ai n | 3 

3 234 |, | 

2 7. 3 

3 (234 

do 11 

s nns 

Edu 

1 1 | 1[|00 + 

2 B Ys V6 

д. 2 з |, £0 X2 

{2 {з у 2 ү 
32 1 flo o = 
yi9 у үлә | 


f. Columns not linearly independent 
33. v1 = 1, v2 = √3(2x - 1), v3 = √5(6x^2 - 6x + 1)
True/False 6.3 


(a) False 
(b) False 
(c) True 
(d) True 
(e) False 
(f) True 


Exercise Set 6.4 


1, 7 


b. 


% à 


Dus 
Lj 


= 


Хр 5, х2 = 


* Solution: х = | 


21 25|[x1] |20 
E 5] - [20] 
15 -1 5 |[х1 E 
-1 22 30 Hi 9 
5 30 45]||*3 13 


i 
2 


,X1—12, x32 —3, x3-9 


| njo polo 


о 
ll 
I 
шы © ш ш w 


i. 3 least squares error: 4/5 
2 
T 


` Solution: x = | o + £(—3, 1) (t a real number); least squares error: 442 


© Solution: x = (- 2, o -F£(—1, — 1, 1) (fa real number); least squares error: 1 294 
9. а,(7,2,9,5) 
b. (- 12 _4 12 2) 
э F 59795 
11. a. det(A^T A) = 0; A does not have linearly independent column vectors.
b. det(A^T A) = 0; A does not have linearly independent column vectors.
13. a. [P] = [1 0 0; 0 0 0; 0 0 1]
b. [P] = [0 0 0; 0 1 0; 0 0 1]
15. a. (1, 0, -5), (0, 1, 3)
b. [P] = (1/35) [10 15 -5; 15 26 3; -5 3 34]
c. ((2x0 + 3y0 - z0)/7, (15x0 + 26y0 + 3z0)/35, (-5x0 + 3y0 + 34z0)/35)
d. 3√35/7
17. s=t=1 


21. [P] = A^T (A A^T)^-1 A


True/False 6.4 
(a) True 

(b) False 

(c) True 

(d) True 

(e) False 

(f) True

(g) False 

(h) True 
Exercise Set 6.5 


bm 
У z t5 


3. y = 2 + 5x - 3x^2


True/False 6.5 
(a) False 
(b) True 
(c) False
(d) True 


Exercise Set 6.6 


l. a, (+a) = 2 sinx = sin 2x 
b. x3 5|e: sin2x | sin 3x sin их 
(+n) 2f sin x + к зк Q0 o, Sar | 
3 а 1 1 
2ta 
b. 13 + l+e 


Eis 


ЕТ) 


b. 12.9. 
1 2 


vl 8 1 Гуе] 
РЬ (= 1) “е 

True/False 6.6 

(a) False

(b) True 

(c) True 

(d) False 

(e) True 

Chapter 6 Supplementary Exercises 
па. (0,2, a, 0) witha #0 


b. 
а 1022500 
ys {5 
3. a. The subspace of all matrices in M_22 with only zeros on the diagonal.


b. The subspace of all skew-symmetric matrices in M_22.


^. 1 1 
+ |—=,0, = 
Е Б) 
9. No 
11. 


я 


(b) g approaches 2 


17. No 


Exercise Set 7.1 


E wj 4.9 12 
5 25 25 
о $3 
.3 _12 16 
5 29:25 
3 uH 
0 1 
o[ 21 21 
y2 {2 
xd. И 
y2 y2 
wf i pg. 
y2 y2 
Sb. 12. 1. 
ye yo Je 
1 1 1 
B үз ys 
(е) 


I 
Ale Ale сул горе 


Ale Al ey— w]e 


Mle Mle Mle Mle 
AA Ale Ale rol 


7 а (—14+343, 3+ 3) 
b. (2 - үз, 2434 1) 


5v afal i 5.1 
| 273/92. 5-73 s) 


b. {1_3 _3_ 
| 3,6, -2 


0 1 0 
sinf 0 созӣ 
b. 1 0 0 

А=|0 cos@ sind 
0 =sinf cosh 


п. а. p 0 =siné 


w ne 


13. a? +b? = 4 


os 


Е E 
(e ee 


а. Rotations about the origin, reflections about any line through the origin, and any combination of these 


17 


` The only possibilities аге = 


21. 


b. Rotation about the origin, dilations, contractions, reflections about lines through the origin, and combinations of these 


c. No; dilations and contractions 


True/False 7.1 

(a) False 

(b) False 

(c) False 

(d) False 

(e) True 

(f) True 

(g) True 

(h) True 

Exercise Set 7.2 

1. a. λ^2 - 5λ = 0; λ = 0: one-dimensional; λ = 5: one-dimensional
b. λ^3 - 27λ - 54 = 0; λ = 6: one-dimensional; λ = -3: two-dimensional
c. λ^3 - 3λ^2 = 0; λ = 3: one-dimensional; λ = 0: two-dimensional
d. λ^3 - 12λ^2 + 36λ - 32 = 0; λ = 2: two-dimensional; λ = 8: one-dimensional
e. λ^4 - 8λ^3 = 0; λ = 0: three-dimensional; λ = 8: one-dimensional
f. λ^4 - 8λ^3 + 22λ^2 - 24λ + 9 = 0; λ = 1: two-dimensional; λ = 3: two-dimensional


* [o2 үз 
3 ү? 
РЕ PO papei | 
үз 2 0 10 
үт үт 
5. [4 9 2 
5 5 25 0 0 
P-| 01 0l PlaP—-| 0-3 0 
3954 0 0 —50 
| 575 
ES T4 4. 4. 
ys y2 ү 
NE ыр 
ys y2 ¥6 ae 
L o x 
үз V6 
9 [243 
-$$ 00 
3.4 0 0 —25 0 0 0 
_| 5 5 aal 0 25 0 0 
pe sdb mt Жү 0 -25 0 
5 5 0 0 о 25 
o0 ii 
15. No 
19. Yes 


True/False 7.2 


(a) True


(b) True 
(c) False 
(d) True 
(e) True 
(f) False 
(g) True 
Exercise Set 7.3 
"omen f 
0 7112 
ea t SE 
—3 —9|]5^2 
c. 9 3 =4 
ХІ 
[x1x2%3]| ? 7! AH 
x 
-4 i 4| 3 
E 


9; 


п. 


13. 
15. 
17. 


19. 
21. 
23. 
27. 
31. 


‚ 2x? + 5y? = бху 


d. 
E. tt 
/2 


Nj 2 2 
К 0 = 3у{ +y 


22 1 
xi 3 3 3 Iryy 
xj|- 2 i 2 y2|, Q=y? +43 +793 
x3 12 _2 УЗ 
L 3. 3 3 
a 2 1 
2 |[х x 
[x у] 1 [;]+1-16[;]+2= 
= 0 
2 
b. 0 O}[x x 
кол ИЕ НЕЕ 


a. ellipse
b. hyperbola 
c. parabola 


d. circle 


Hyperbola: 4(x')^2 - (y')^2 = 3; θ ≈ 36.9°


a. Positive definite

b. Negative definite

c. Indefinite

d. Positive semidefinite


e. Negative semidefinite


Positive definite 
Positive semidefinite 
Indefinite 
k-2 
a. 1 E 1 
n n(n—l) 
mI =, i 
А= (п = 1) n 
1 1 


Taal) —n(xn—1) 


b. Yes 


Hyperbola: 2(y')^2 - 3(x')^2 = 8; θ ≈ 26.6°


33. A must have a positive eigenvalue of multiplicity 2. 


True/False 7.3 
(a) True 
(b) False 
(c) True 
(d) True 
(e) False 
(f) True 
(g) True 
(h) True 
(i) False 
(j) True 
(k) False 
(1) False 


Exercise Set 7.4 

1. Maximum: 5 at (1, 0) and (-1, 0); minimum: -1 at (0, 1) and (0, -1)
3. Maximum: 7 at (0, 1) and (0, -1); minimum: 3 at (1, 0) and (-1, 0)
5. Maximum: 9 at (1, 0, 0) and (-1, 0, 0); minimum: 3 at (0, 0, 1) and (0, 0, -1)
7. Maximum: z = 4√2 at (x, y) = (2√2, 2) and (-2√2, -2); minimum: z = -4√2 at (x, y) = (-2√2, 2) and (2√2, -2)


13. Critical points: (—1, 1), relative maximum; (0, 0), saddle point 
15. Critical points: (0, 0), relative minimum; (2, 1) and (—2, 1), saddle points 


17. нй ы у 
Corner points: * = ү?” к= 
2 /2 


21. g(x) =A 
True/False 7.4 
(a) False 

(b) True 

(c) True 

(d) False 

(e) True 


Exercise Set 7.5 
1. [2 4 | 


14i 3-i 0 
3 1 i 2-3 
А=| =i -3 1 
243 1 2 
5; а. 413 * 3] 
p. 422 * 222 
9. з 4 
* = 5 5 
А“ =А\ = 
-Aj -% 
5 5 
0, -ie43  l-i 


түз 

` 2y2 2/2 
Е 1-03 =i- y3 

2/2 2y2 


13. —1+: 1-i 


15. zie dd 
6 үз 20 
PEDE. ол »-[; | 
ye үз 
17. [0 0 1 
2. ott 9 -200 
p-| 46 We р=| 010 
їн 2 4 005 
ye 6 
19. | 0 i 2-8i 
A-| i 0 1 
9361. 8 


21. а. 413% = 231 
p. 41% 21 


29. (c) B and C must commute.
о о ш 

y2 ү 

ре а 

yag ү 


39. Multiplication of x by P corresponds to ||u||^2 times the orthogonal projection of x onto W = span{u}. If ||u|| = 1, then multiplication of x by H = I - 2uu^T corresponds to reflection of x about the hyperplane u⊥.
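The reflection described in this answer can be checked numerically. The sketch below is a minimal illustration for a real unit vector, assuming H has the form I - 2uu^T; the vector u = (3/5, 4/5) and the helper names `householder` and `apply` are hypothetical choices made for the example.

```python
from fractions import Fraction

def householder(u):
    """Return H = I - 2 u u^T for a real unit vector u given as a list."""
    n = len(u)
    return [[(1 if i == j else 0) - 2 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

# Hypothetical unit vector in R^2 (3-4-5 triangle, so the norm is exactly 1).
u = [Fraction(3, 5), Fraction(4, 5)]
H = householder(u)

reflected_u = apply(H, u)                       # u is sent to -u
fixed = apply(H, [Fraction(-4), Fraction(3)])   # a vector orthogonal to u is unchanged
```

A vector along u is reversed and a vector in the hyperplane orthogonal to u is left alone, which is exactly the behavior of a reflection about that hyperplane; applying H twice returns every vector to itself.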


True/False 7.5 
(a) False 
(b) False 
(c) True 
(d) False 
(e) False


Chapter 7 Supplementary Exercises 


„з 4 34 
5 75| | 55 
4 3| | аз 
> 5 3.3 
bf 45 _3) [4.9 12 
5 5 5 25. 25 
9 4 2| |o 4 3 
25 5 25 5 35 
12 3 16| | 3 _12 16 
25 5 25 5 25 25 
5. 1 1 
з үз °* 


7. positive definite 


9. a. parabola 


b. parabola 


Exercise Set 8.1 
1. Nonlinear 
3. Linear 


5. Linear 


а. Linear 

b. Nonlinear 

9. T(x1, x2) = (-4x1 + 5x2, x1 - 3x2); T(5, -3) = (-35, 14)

11. T(x1, x2, x3) = (-x1 + 4x2 - x3, 5x1 - 5x2 - x3, x1 + 3x3); T(2, 4, -1) = (15, -9, -1)
13. T(2v1 - 3v2 + 4v3) = (-10, -7, 6)

15. (a) 

17. (a) 

19. (a) 

& (4) 

b. (1,0, 0), (0,1,0), (5. —4, 1) 


с.х, х2, xi 


23. a [1] f- 


b. |—14 

19 

11 
c. Rank(T) = 2, nullity(T) = 1
d. Rank(A) = 2, nullity(A) = 1


ЕТ 


b. |—1 —4 
-1 2 

1 0 

0 7 


c. Rank(T) = nullity(T) = 2
d. Rank(A) = nullity(A) = 2


27. a. Kernel: y-axis; range: xz-plane 
b. Kernel: x-axis; range: yz-plane 


c. Kernel: the line through the origin perpendicular to the plane у = x; range: plane y = x 


29. a. Nullity(T) = 2
b. Nullity(T) =4 
c. Nullity(T) = 3 
d. Nullity(T) = 1 
31. аз 


b. No 
33. A line through the origin, a plane through the origin, the origin only, or all of R3 
35. (b) No 
41. ker(D) consists of all constant polynomials. 
95 атоо) =) 
b. TG @)) = у(х) 


True/False 8.1 
(a) True 
(b) False 
(c) True 
(d) False 
(e) True 
(f) True 
(g) False 
(h) False 
(i) False 


Exercise Set 8.2 


п. 


1. a. ker(T) = {0}; T is one-to-one


[za 


a © 


* ker(7) = un 3, iy} T is not one-to-one 


c. ker(T) = {0}; T is one-to-one
d. ker(T) = {0}; T is one-to-one
e. ker(T) = {k(1, 1)}; T is not one-to-one


f. ker(T) = {k(0, 1, -1)}; T is not one-to-one


a. Not one-to-one 


b. Not one-to-one 


. One-to-one 


a. ker(T) = {k(-1, 1)}


7 


a 


d. a 
Т(а +b sin(x) +c cos(x)) = H 


. T is not one-to-one since ker(T) ≠ {0}.
. T is one-to-one 
. T is not one-to-one 
. T is not one-to-one 
. T is one-to-one 
a 
la b c b 
bd e||-|; 
T e =|4 
ce f e 
/ 
а а 
[a b b a b с 
T^ аер = 
(2 21) 141 2) 
d d 


a 
ў Tax? + bx? + cx) = H 
c 


[^ 


13. T is not one-to-one since, for example, f(x) = x^2 (x - 1)^2 is in its kernel.


15. Yes; it is one-to-one 


17. Tis not one-to-one since, for example a is in its kernel. 
19. Yes 


True/False 8.2 
(a) False 
(b) True 
(c) False 
(d) True 
(e) False 
(f) False 


Exercise Set 8.3 


1; 


an C E 


с ы 


(T2 ∘ T1)(x, y) = (2x - 3y, 2x + 3y)

(T2 ∘ T1)(x, y) = (4x - 12y, 3x - 9y)

(T2 ∘ T1)(x, y) = (2x + 3y, x - 2y)

(T2 ∘ T1)(x, y) = (0, 2x)

„а+4 

- (T3 0 Ту) (4) does not exist since Tj (A) is not a 2 x 2 matrix. 


5 Туф) = 1+ 


п. 


а 


. T has no inverse. 


1х\ + lxj-Óx4 


xi 8 8 4 
a ed ada 
T Hi 81+ 822 + 43 
Зра 
guit gta t 4x3 
& ГЫ c 
[а] le 
d =| ol, 41541 
T |x2|= 01+ 52+ 943 
[з] haio pind 
91+ 551—%®3 
d. [xi 3x1 + 3x2—%3 
p n|- —2x| = 2x2 + X3 
[ ^3 —Ax| — 5x3 + 2x3 


13. а. ay 0 fori —1,2, 3,..., и 


ES! 1 1 1 1 
"TO (x1, Z2 X3, --» Xn) = (л. 2, 203,0... Xn 
1 и 


їз. 


x 
1. (ау(1,—1) 
(d) 771(2,3) =24x 


21. a. T1 ∘ T2 = T2 ∘ T1
b. T1 ∘ T2 ≠ T2 ∘ T1
c. T1 ∘ T2 = T2 ∘ T1


True/False 8.3 
(a) True
(b) False 
(c) False 
(d) True 
(e) False 
(f) True 


Exercise Set 8.4
A a 


4 
0 
0 
0 
1 


а. 


wilco ro] o 


— 


1 
24 
0 4 
10x + 16x? 


AE. 
0 
0 
= 
тодів |2 trooie=[5| 


тор 5) тоо) 


E- i as] 


= 


Уе чїй 


* Tj Q6) = РКО, Ту! (р(х)) = р(х — 1); (Тао Тү) (6) = 196 - 0) 


co уе 
[83 [c 


п. а. 1 3 -1 
[7(v))]1g2 2 |, [7(v3]g-2 | 0}, [T(vnlg-| 5 
6 -2 4 


са 


T(v1) = 16 + 51x + 19x^2, T(v2) = -6 - 5x + 5x^2, T(v3) = 7 + 40x + 15x^2


с. т@ aix +ау®) —..239ag — zum + 28942 + 201ag — His + 247432 x4 ani + 107432 х2 


a 


d. T(1 + x^2) = 22 + 56x + 14x^2


13. а 0 0 000 
6 0 зоо 0 
[F207] ei p= ‚ [Т2]в'в"= [71]в"в=|0 —3 
0 —9 030 0 0 
0 0 00 3 
b. [T2 ∘ T1]_{B',B} = [T2]_{B',B''} [T1]_{B'',B}
1. a. [0 0 0; 0 0 -1; 0 1 0]
b. [0 0 0; 0 1 0; 0 0 2]
c. [2 1 0; 0 2 2; 0 0 2]
d. 14e^(2x) - 8xe^(2x) - 20x^2 e^(2x) since [2 1 0; 0 2 2; 0 0 2] [4; 6; -10] = [14; -8; -20]
21. a. Br, Bu 
b В' Blu 
True/False 8.4 
(a) False 
(b) False 
(c) True 
(d) False 
(e) True 
Exercise Set 8.5 
1. _ ud. 56 
1 -2 11 11 
[e=] o 1 [Т]1в,= 2 ж 
11 11 
3. jt. В 13_ __25 
SUR | LU mz n 
[7157 CNET [71g = 5 9 
2 ү? п 12 
5. [100 100 
[7]g2|0 1 0|, [7] 2|0 1 1 
[0 0 0 0 00 
7 [2 _2 
3 9 
[7157 1 al ъ= | 
|2 3 
п. а 1 1 
a- Is) Lal} 


13. 4 A= =—4, A=3 


b. Basis for eigenspace corresponding to A = —4: – 2 + 8, -+ х2; basis for eigenspace corresponding to \ = 3: 5 2 2x + x? 


21. The choice of an appropriate basis can yield a better understanding of the linear operator. 
True/False 8.5 

(a) False

(b) True 

(c) True 

(d) True 

(e) True 

(f) False 

(g) True 

(h) False 


Chapter 8 Supplementary Exercises 


1. No. T(x1 + x2) = A(x1 + x2) + B ≠ (Ax1 + B) + (Ax2 + B) = T(x1) + T(x2), and if c ≠ 1, then T(cx) = cAx + B ≠ c(Ax + B) = cT(x).
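The argument in this answer can be illustrated with concrete numbers. The sketch below uses a scalar version of the map, T(x) = Ax + B with the hypothetical constants A = 2 and B = 3, purely as an illustration; the exercise itself may involve a matrix A and a vector B.

```python
def T(x, A=2.0, B=3.0):
    """Affine map T(x) = A x + B with hypothetical constants A = 2, B = 3."""
    return A * x + B

x1, x2, c = 1.0, 4.0, 5.0

# Additivity fails: T(x1 + x2) = 13 while T(x1) + T(x2) = 5 + 11 = 16.
additive = T(x1 + x2) == T(x1) + T(x2)

# Homogeneity fails for c != 1: T(c x1) = 13 while c T(x1) = 25.
homogeneous = T(c * x1) == c * T(x1)
```

A quicker test in practice is to evaluate T at zero: a linear transformation must send 0 to 0, whereas here T(0) = B.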
5. 


Em 


. T(e3) and any two of T(e;), T(e2), and T(e4) form bases for the range; ( — 1, 1, 0, 1) is a basis for the kernel. 
b. Rank = 3, nullity = 1


7. a. Rank(T) = 2 and nullity(T) = 2


b. T is not one-to-one. 


11. Rank = 3, nullity = 1
13. [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1]

15. [T]_B = [-4 0 9; 1 0 -2; 0 1 1]

17. [T]_B = [1 -1 1; 0 1 0; 1 0 -1]


19. (p f()-7z,  (х)=1 
(©) fG) =e", g(x) =e* 


21. (d) The points are on the graph. 
25.10 боо «=. 0 
10 0... 0 
050 0 
0 0 i 0 
000 PESE 


Exercise Set 9.1 

1. x1 = 2, x2 = 1

3. x1 = 3, x2 = -1

5. x1 = -1, x2 = 1, x3 = 0

7. x1 = -1, x2 = 1, x3 = 0

9. x1 = -3, x2 = 1, x3 = 2, x4 = 1
11. 


* 20 0111-2 
A=LU=|-21 01], 5 1 
201, 5 | 

p. 100][200]1 2 -1 
A=LDU =| <1 1 o|lo 16 

отоо 1 

00 1 


с. 10021-1 
А=1305=|-1 10100 1 
10100 1 
13. 1 00[|30011 —4 2 
А=|0 10020 10 
2 -2 111001 0 1 
15. 4,221 х= 214 ,,-12 


True/False 9.1 
(a) False

(b) False 

(c) True 

(d) True 

(e) True 
Exercise Set 9.2 
1. а. Аз dominant 

b. No dominant eigenvalue 


3. x1 ≈ (0.98058, -0.19612), x2 ≈ (0.98837, -0.15206), x3 ≈ (0.98679, -0.16201), x4 ≈ (0.98715, -0.15977);


dominant eigenvalue: λ = 2 + √10 ≈ 5.16228;
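The iterates and dominant eigenvalue in this answer come from the power method with Euclidean scaling. The sketch below shows one way the computation might be carried out; the symmetric matrix with rows (5, -1) and (-1, -1) and the starting vector (1, 0) are assumptions chosen because they are consistent with the eigenvalue 2 + √10 and the iterates quoted here, and the function name `power_method` is hypothetical.

```python
import math

def power_method(A, x, iterations=50):
    """Power method with Euclidean scaling.

    Returns the Rayleigh quotient estimate of the dominant eigenvalue
    and the final unit vector.
    """
    for _ in range(iterations):
        y = [sum(a * b for a, b in zip(row, x)) for row in A]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    Ax = [sum(a * b for a, b in zip(row, x)) for row in A]
    eigenvalue = sum(a * b for a, b in zip(Ax, x))   # x has unit length
    return eigenvalue, x

# Assumed matrix; its eigenvalues are 2 + sqrt(10) and 2 - sqrt(10).
A = [[5.0, -1.0],
     [-1.0, -1.0]]
dominant, vector = power_method(A, [1.0, 0.0])
first_estimate, first_iterate = power_method(A, [1.0, 0.0], iterations=1)
```

Convergence is governed by the ratio of the two eigenvalue magnitudes, roughly 0.23 for this matrix, so a handful of iterations already gives several correct digits.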


T 1 1 
dominant eigenvector: |; Т] = m я 
та ЕІ; АО баа Ex ALES. xim Еи: XO æ 6.60550; 


1 
хд inca! A æ 6.60555; 


dominant eigenvalue: λ = 3 + √13 ≈ 6.60555;


3 


mo | 26-4413 | [047186 
ominant eigenvector: 24 hs = 0.88167 
26A 13 


ж Т 
15) 20.5} 9^ |-og| 2% | 20.929 


b. AD 228, А 2.976, А 2.997 


Ў . | : : 1 
* Dominant eigenvalue: 4 = 3; dominant eigenvector: | 


d. 0.196 
9. 0.99180 
2.99993; 

i ЕЯ 

13. а. 1 
Starting with | 0 |, it takes 8 iterations. 

0 

b. 1 
Starting with : , it takes 8 iterations. 

0 


Exercise Set 9.3 


1. 1 2 
һю=|2/, ap—|0 
2 3 
3 0.39057 0.60971 
hy = | 0.65094 |, а; = 0 
0.65094 0.79262 


5. Sites 1 and 2 (tie); sites 3 and 4 are irrelevant 
7. Site 2, site 3, site 4; sites 1 and 5 are irrelevant 
Exercise Set 9.4 
1. a. ≈ 0.067 second
b. ≈ 66.68 seconds
c. ≈ 66,668 seconds, or about 18.5 hours
3. a. ≈ 9.52 seconds
b. ≈ 0.0014 second
c. ≈ 9.52 seconds
d. ≈ 28.6 seconds


5. а. 6.67 x 107 s for forward phase, 10 s for backward phase 
b. 1334 
7. 2 flops 


9. 2n^3 - n^2 flops


Exercise Set 9.5 


1. 0, y5 
з. (5 
е Е 
d= 2 ИГ? all | 
1 Lilo foo! 
5 0. 
И Е 1 a 2 
a Ele of EE 
A 2 10212 1 
5 ys y5 ¥5 
A 21 2 
2 6 zd dE 
А 1 242 a2 0 y2 {2 
713 0-7 уо 9| 1 1 
21 E° HR R 
3 y2 6 
% UD dL. g < 
Bo Y. 

к ale, d 10 
4-| ү e|? ДНИ 
sd. а LIL? 2 

y3 {2 Je 
True/False 9.5 
(a) False
(b) True 
(c) False 
(d) False 
(e) True 
(f) False 
(g) True 


Exercise Set 9.6 


wwf wiw 


_2 
3 
К А a 
A alB olmo 
xa vale | 
1 1 
B у 
5. 2 
Sor a3 
ALE us 
-2 E 
3 
Re 1 = 
6 | 
үз ке ы 
-Б |2 


9. 70,100 numbers must be stored; A has 100,000 entries
True/False 9.6 

(a) True 

(b) True 

(c) False 


Chapter 9 Supplementary Exercises 


1| 20—31 
[2 sll 2] 
0 
0 
2 


3.|2 0 1:2: 3 
12 012 
11 001 


a. 1 
\=3 E 
mda fe 
/2 
һу, 07100]... [07071 
57107041" "0.7071 
РЕ AR 
5% 10.9918 
E id. а л. 
20 
2 2 
0 1 ollo o A 
xL opo s|-ee.- 
y2 ү? 
п A. 1 
2 2 
32.06] 1.1 2.1 2 
4 -810| |2 72 |[24 0]3 73 3 
4 -8 10 1 1]|0 1212 2 1 
2 0 6] |2 2 E MN. 
i 1 
2 2 


Exercise Set 10.1 


1. a. y = 3x - 4
b. y = -2x + 1

2. a. x^2 + y^2 - 4x - 6y + 4 = 0 or (x - 2)^2 + (y - 3)^2 = 9
b. x^2 + y^2 + 2x - 4y - 20 = 0 or (x + 1)^2 + (y - 2)^2 = 25

3. x^2 + 2xy + y^2 - 2x + y = 0 (a parabola)
4. a. x + 2y + z = 0
b. -x + y - 2z + 1 = 0
5. alx у z 0 
x1 ¥1 z 1 
х2 уз 22 1 
хз уз z3 1 


c 


хф 2у +2= 0 =x+y—2z2=0 
8. a. x^2 + y^2 + z^2 - 2x - 4y - 2z = -2 or (x - 1)^2 + (y - 2)^2 + (z - 1)^2 = 4
b. x^2 + y^2 + z^2 - 2x - 2y = 3 or (x - 1)^2 + (y - 1)^2 + z^2 = 5


10. |y x? xy | 


wi Хр ox 1 

y2 x5 x2 1 

y3 т x3 1 
11. The equation of the line through the three collinear points 
12, 020 
13. The equation of the plane through the four coplanar points 


Exercise Set 10.2 


1. хү=2,х2= 2; maximum value of z — 2 

2. No feasible solutions 

3. Unbounded solution 

4. Invest $6000 in bond A and $4000 in bond B; the annual yield is $880.
5. 7 


. 5 cup of milk, 2 ounces of corn flakes; minimum cost — 25. = 18.68 


6. a. x1 ≥ 0 and x2 ≥ 0 are nonbinding; 2x1 + 3x2 ≤ 24 is binding

b. x1 —x2 =v for y < — 3 is binding and for y = — 6 yields the empty set. 

с. X2 € v for y æ 9 is nonbinding and for y æ Q yields the empty set. 
7. 550 containers from company A and 300 containers from company B; maximum shipping charges = $2110
8. 925 containers from company A and no containers from company B; maximum shipping charges = $2312.50 


9. 0.4 pound of ingredient A and 2.4 pounds of ingredient B; minimum cost = 24.88 
Exercise Set 10.3 


1. 700 
2. a. 5 
b. 4 
4. а. Ох, a units; sheep, 2 unit 


b. First kind, Ж measure; second kind, 4 measure; third kind, x measure 


5o a х = Metio iuni „Xi == x1, i= 2, 3, A 


b. Exercise 7(b); gold, 305 minae; brass, 91 minae; tin, 141 minae; iron, 51 minae 


6. a. 5x + y + z - K = 0
x + 7y + z - K = 0
x + y + 8z - K = 0
x = 21t/131, y = 14t/131, z = 12t/131, K = t, where t is an arbitrary number

b. Take t = 131, so that x = 21, y = 14, z = 12, K = 131.

c. Take t = 262, so that x = 42, y = 28, z = 24, K = 262.


a. Legitimate son, sni staters; illegitimate son, 4222 staters 


b. Gold, 305 minae; brass, 2 minae; tin, 141 minae; iron, 51 minae


€. First person, 45; second person, 315 third person, 225 


Exercise Set 10.4 


a. S(x) = -.12643(x - .4)^3 - .20211(x - .4)^2 + .92158(x - .4) + .38942
b. S(.5) = .47943; error = 0%


3. a. The cubic runout spline
b. S(x) = 3x^3 - 2x^2 + 5x + 1

4. — .00000042(x + 10)? + .000214(x--10) + .99815, -10<x<0 
ous .00000024(x)) = .0000126(х)2 + .000088(х) + 99987, 0<х<10 
x)z 

— .00000004(x 10)? —  .0000054(x—10)? —  .000092(x 10) + .99973, 10<x<20 
.00000022(x — 20)?  —  .0000066(x —20)? —  .000212(x—20) + .99822, 20<x<30 
Maximum at (x, S(x)) — (3.93, 1.00004) 

5. .00000009(x +10)? —  .0000121(x--10)? + .000282(x--10) + .99815, —10<х<0 
"T .00000009 (x)? —  .0000093(x)? + .000070(x) + 99987, 0<х<10 
х)= 

.00000004(x — 10)? —  .0000066(x—10)? —  .000087(x—10) + .99973, 10<x<20 
.00000004(x — 20)? —  .0000053(x—20)? —  .000207(x—20) + 99823, 20<x<30 

Maximum at (x, S(x)) — (4.00, 1.00001) 

6 a 3 

Я = <x<0. 

Six) = H i» 0<х<0.5 

4x —l2x^--9x 1 05<х<1 

b. 2—2x 05<х<1 
х= 5-а 14х<15 

c. The three data points are collinear.

7. _ 

Ir TOS ...0001 Mi Yn-1 2» + X2 
1410... 000 0|| M2 Xi - y + әз 
0141... 000 0]] Мз |6] уз — 23 + ya 
кызра E : à? E 
0000 +--+ 01 4 1| M, > Yn-3 = 2¥n-2 + Yn 
1000 :-:: 001 4 Mya уна = Wn + yı 

8. 

(b) = уе 
2100 ооо 1]| 24 ni л + v 
1410 0000]| M2 yj = 2y2 + уз 
0141 0000] Ms |_ 6 ya —- D3 + ye 

i i д2 : 
0000 004 1| Ayi Yn-2 = Wat + yn 
0000 0 112] м, "ape р 


Exercise Set 10.5 
l. a xti Hi NOM Bi x р! NONI E NOM eee 


54 546 5454 54546 
b. 5 
P is regular since all entries of P are positive; q = E 
1 


а, 3 23 273 
х0 = |2 |, х= |52 |, xO = | 396 


72 
: М ; а |29 
P is regular, since all entries of P аге positive: 9 = 72 
21 
72 
3; a[9 
17 
8 
17 
b. | 26 
45 
19 
45 
ce|[35 
19 
A 
19 
12 
19 
4. & | 1 j т 
Р" = " ‚ #=1,2,.... Thus, no integer power of P has all positive entries. 
1-(2) 1 
(3) 
7 00 0 
В Р" f ; as n increases, so phy, H for any x©) as n increases. 
^ The entries of the limiting vector Н are not all positive. 
6. 111 1 
24 4 3 
2_|111 сы E: 
Pe 4724 has all positive entries; q 3 
d d o 1 
442 3 
7. 10 


13 
8. 54 1/6 % in region 1, 16 2/3 % in region 2, and 29 1/6 % in region 3


Exercise Set 10.6 


L. a[0001 
1011 
1101 
0000 

b [01100 
00001 
10010 
00100 
00100 

с. [010100 
100000 
010111 
000001 
000001 
001010 

2. a 5 P. 


Р, Р, 


Р, Р, 
Р, Р, 
c. | | | | IS 
P, 6 Р; Р, 
Р, 
Р, Р, 


b. 1- step: PP 
2—step: Pi — P4— Pa 
Pi 34 P34 Pa 
3—step: Рү + P3 — Р Р 
Pi > Рз Ра Р» 
Pi Ра Рз — P3 
c. 1 step: Pi — Pq 
2 = step: Py — P4— Рд 
3 —= step: Pi — Рэ — P1 Рд 
Pi > Py Рз Рд 


0 


ооо н 
о о н о 
онно о 
се № ~ © 
= о о о 


0 0 
(c) The i jth entry is the number of family members who influence both the ith and jth family members. 
5. а, (Рі, Pa, P3) 
b. (Рз, Ра, P5) 
с. (P3, P4, Pg, Pg) and (Рд, P5, Pg) 


a. None 
b. (Ps, P4, Рв) 
7. [00 1 1 Power of Py = 5 
100 Q| Power of P27 = 3 
0 1 0 1| Power of P5—4 
0 1 0 0| Power of P4—2 


8. First, A; second, B and E (tie); fourth, C; fifth, D 
Exercise Set 10.7 


L a —5/8 
b. [0 1 0] 


c. [10 0 0]? 
* Let A= [| if: for example. 


3 * * 
a o ail, q - 1) у= 


Бро=[0 10], “= | ym 


p =[0 01], q" 


1 
0 
c. 0 
1 
0 


а, 1 
[8 2]. v-[| 73 
8 
b 1 
в'=[2 3h a y= 
6 
с. 1 
ве 010, «|5 »=3 
0 
4. 3 
p=(3 3} =] dae: 
5 
е. EA 
"= |5 3} =| > v= 3 
1 
5 It 
*. [ЛЗ T * |20 _ _3 
p=|33 бї =|у у= 720 
20 


с. | 78 
54 
79 


D 


. Use Corollary 10.8.4; all row sums are less than one. 


b. Use Corollary 10.8.5; all column sums are less than one. 
e. 2 19 
Use Theorem 10.8.3, with x= | 1 |= Сх= | 9|. 
1 J 


. Е? has all positive entries. 


. $1256 for the CE, $1448 for the EE, $1556 for the ME 
(b) 242 
503 
Exercise Set 10.9 
1. The second class; $15,000 
2. $223 
3. 1:1.90:3.02:4.24:5.00 
5 
6 


3 
4. Price of tomatoes, $120.00; price of corn, $100.00; price of lettuce, $106.67 
5 
6. 


-sie m Bg) 
1:2:3: : a= 


Exercise Set 10.10 


з [0110 
0 
0 


о 


© c 
© кюре 


(b) 


0  .866 1.366 .500 
0 = ү „ 
0 


(0,0,0), (1,0,0), (14, 1,0), and (1,1,0) 


(с) (0,0,0), (1,.6,0), (1, 1.6, 0), (0,1,0) 


а. 


R= 


1 0 
0 —-10 
0 0 


Р: 
det о 


І 
2 
0 о 
0 0 sin 20 


о o Юе 
o o tl 


cos (—45) 0 sin(—45) 0 
Ma4— 0 1 0 , M5=/1 
—sin(—45) 0 cos(—45) 0 


Ри = М5МАМЗз(МҮР+ М) 


0 0 Коч р. 11 
5 0|, М=|0 cos45 —ш45 |, Ma3—|0 0 
0 1 0 0 


3 
Муү=|0 
0 0 sin45  cos45 


‚ M3—|0 cos20 


0 
—sin 20° 
cos 20° 


соз 35° 0 sin35' cos(—45) -sin(—45) 0 


Ma=| 0 1 0 |, Ms—|sn(-45) cos(—45) 0} 


—sin35' 0 cos35 0 0 


00... 0 2 
М=|0 0 --- 0| Му=|0 
11 1 0 


oro 
= © © 


Pr = MMs Ma (MMP + Мз) + Mg) 


cos 0 sng cosa -—sina 0 
0 1 0 |, R2|sne cosa 0 |, 
=sing 0 cos 0 0 1 


cosÜ 0 sn cosa sina 0 
0 1 0 | R4,2|-sna cosa 0 |, 
—sinÜ 0 созӣ 0 0 1 


cos 0 -—sng 
0 1 0 
sing 0 cos 


1 


Ш 
оо о ~ 
ooo 


о н о о 
=- оооло но © 


oo о н 
оо н © 


Exercise Set 10.11 


1. 


а. 


EN 
[n 

о ын Bl o 
© 


Blo Blo Alw Blo 


Mu 


rol о кюе о 
D 
| 


РА 


20 


о M^ 
о 


fy 
£2 
£3 
£4 


© 
о A Aje 


I 


caja cal colin colo 


он © юре © 


16 
п 
16 


d. for£; and tz, — 12.9%; for £z and £4, 5 2% 


Exercise Set 10.12 
^ © (3 2) 


2. 


22' 22 
a. x^(1) = (1.40000, 1.20000)
   x^(2) = (1.41000, 1.23000)
   x^(3) = (1.40900, 1.22700)
   x^(4) = (1.40910, 1.22730)
   x^(5) = (1.40909, 1.22727)
   x^(6) = (1.40909, 1.22727)


b. Same as part (a)


c. x^(1) = (9.55000, 25.65000)


16


7 
16 


x^(2) = (.59500, -1.21500)
x^(3) = (1.49050, 1.47150)
x^(4) = (1.40095, 1.20285)
x^(5) = (1.40991, 1.22972)
x^(6) = (1.40901, 1.22703)
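
The iterates listed for Exercise 2 appear to come from cycling through orthogonal projections onto the beam lines. A minimal sketch of one such cycle follows, assuming each equation has the form a . x = b. The function name `projection_cycle` and the two-line system in the usage example are hypothetical and are not the system from the exercise.

```python
def projection_cycle(rows, rhs, x):
    """Project x onto each hyperplane a . x = b in turn and return the result."""
    x = list(x)
    for a, b in zip(rows, rhs):
        dot = sum(ai * xi for ai, xi in zip(a, x))
        norm_sq = sum(ai * ai for ai in a)
        step = (b - dot) / norm_sq          # signed distance factor along a
        x = [xi + step * ai for xi, ai in zip(x, a)]
    return x


# Hypothetical system: x + y = 2 and x - y = 0, with solution (1, 1).
rows = [[1.0, 1.0], [1.0, -1.0]]
rhs = [2.0, 0.0]
x = [0.0, 0.0]
for _ in range(3):
    x = projection_cycle(rows, rhs, x)
print(x)
```

Because the two hypothetical lines are perpendicular, a single cycle already lands on their intersection; for lines that meet at other angles the iterates approach the intersection gradually, as in the answers above.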


21 
16 


16 
16 


Mom 


4. x1* = (1, 1), x2* = (2, 0), x3* = (1, 1)
7. x7 + x8 + x9 = 13.00
x4 + x5 + x6 = 15.00
x1 + x2 + x3 = 8.00
.82843(x6 + x8) + .58579x9 = 14.79
1.41421(x3 + x5 + x7) = 14.31
.82843(x2 + x4) + .58579x1 = 3.81
x3 + x6 + x9 = 18.00
x2 + x5 + x8 = 12.00
x1 + x4 + x7 = 6.00
.82843(x2 + x6) + .58579x3 = 10.51
1.41421(x1 + x5 + x9) = 16.13
.82843(x4 + x8) + .58579x7 = 7.04
8. x7 + x8 + x9 = 13.00
x4 + x5 + x6 = 15.00
x1 + x2 + x3 = 8.00
.04289(x3 + x5 + x7) + .75000(x6 + x8) + .61396x9 = 14.79
.91421(x3 + x5 + x7) + .25000(x2 + x4 + x6 + x8) = 14.31
.04289(x3 + x5 + x7) + .75000(x2 + x4) + .61396x1 = 3.81
x3 + x6 + x9 = 18.00
x2 + x5 + x8 = 12.00
x1 + x4 + x7 = 6.00
.04289(x1 + x5 + x9) + .75000(x2 + x6) + .61396x3 = 10.51
.91421(x1 + x5 + x9) + .25000(x2 + x4 + x6 + x8) = 16.13
.04289(x1 + x5 + x9) + .75000(x4 + x8) + .61396x7 = 7.04


Exercise Set 10.13 
13] [0 12 
х 12 [1 0|[х e|. е; 0 25 25 
[Ту 12 = ' = (4) /|25-| = 1.888... 
ЩН) 2 TEM 1,2,3,4, where the four values of [М are Hi H H and | 2 ан) n4) (22) 1.888 


2. gg AT; d g(S) x In(4) / In(1/.47) = 1.8. ... Rotation angles: 9° (upper left); —99° (upper right); 180° (lower left); 180° (lower right); 
(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (0, 0, 1), (0, 0, 2), (1, 2, 0), (2, 1, 3), (2, 0, 1), (2, 0, 2), (2, 2, 0), (0, 3, 3) 


a. (i) s = 1/3; (ii) all rotation angles are 0°; (iii) d_H(S) = ln(7)/ln(3) = 1.771... This set is a fractal.

b. (i) s = 1/2; (ii) all rotation angles are 180°; (iii) d_H(S) = ln(3)/ln(2) = 1.584... This set is a fractal.

c. (i) s = 1/2; (ii) rotation angles: -90° (top); 180° (lower left); 180° (lower right); (iii) d_H(S) = ln(3)/ln(2) = 1.584... This set is a fractal.

d. (i) s = 1/2; (ii) rotation angles: 90° (upper left); 180° (upper right); 180° (lower right); (iii) d_H(S) = ln(3)/ln(2) = 1.584... This set is a fractal.


$—.8509.,8— —2. 69°... 


(0.766, 0.996) rounded to three decimal places 
d_H(S) = ln(16)/ln(4) = 2


ща (3-488... 


OP EN ues т 


9. d_H(S) = ln(8)/ln(2) = 3; the cube is not a fractal.
10. k = 20; s = 1/3; d_H(S) = ln(20)/ln(3) = 2.726...; the set is a fractal.


11. [Figure: the initial set and its first, second, third, and fourth iterates]


d_H(S) = ln(2)/ln(3) = 0.6309...


12. Area of S0 = 1; area of S1 = 8/9 = 0.888...; area of S2 = (8/9)^2 = 0.790...; area of S3 = (8/9)^3 = 0.702...; area of S4 = (8/9)^4 = 0.624...
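
The dimensions and areas in this exercise set can be recomputed with a few lines of code. The sketch below assumes the self-similar formula d_H(S) = ln(k)/ln(1/s), where k is the number of congruent subsets and s is the scale factor, and assumes that each iterate in Exercise 12 keeps 8/9 of the previous area. The function names are illustrative.

```python
import math


def hausdorff_dimension(k, s):
    """Dimension of a self-similar set made of k copies scaled by the factor s."""
    return math.log(k) / math.log(1 / s)


def iterate_area(n, ratio=8 / 9):
    """Area of the nth iterate when each step keeps the fraction `ratio` of the area."""
    return ratio ** n


print(round(hausdorff_dimension(7, 1 / 3), 3))   # Exercise 4(a)
print(round(hausdorff_dimension(3, 1 / 2), 3))   # Exercise 4(b)-(d)
print(round(hausdorff_dimension(20, 1 / 3), 3))  # Exercise 10
print([round(iterate_area(n), 3) for n in range(5)])
```

Note that the answers in the text truncate the decimals, while `round` rounds them, so the last printed digit can differ by one.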


Exercise Set 10.14 
1. Π(250) = 750, Π(25) = 50, Π(125) = 250, Π(30) = 60, Π(10) = 30, Π(50) = 150, Π(3750) = 7500, Π(6) = 12, Π(5) = 10
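
The periods in Exercise 1 can be checked by brute force. The sketch below assumes that Π(N) is the smallest positive n for which the nth power of the cat-map matrix C = [[1, 1], [1, 2]] is congruent to the identity modulo N. The function name `cat_map_period` is illustrative.

```python
def cat_map_period(N):
    """Smallest n >= 1 with C^n = I (mod N) for C = [[1, 1], [1, 2]], N >= 2."""
    # (a, b, c, d) holds the entries of the current power of C, reduced mod N.
    a, b, c, d = 1 % N, 1 % N, 1 % N, 2 % N
    n = 1
    while (a, b, c, d) != (1, 0, 0, 1):
        # Multiply the current power on the right by C.
        a, b, c, d = (a + b) % N, (a + 2 * b) % N, (c + d) % N, (c + 2 * d) % N
        n += 1
    return n


for N in (5, 6, 10, 25, 50):
    print(N, cat_map_period(N))
```

The loop terminates for every N >= 2 because det(C) = 1, so C is invertible modulo N and its powers must return to the identity.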


2. Опе I-cycle: ((0, 0)) ; one 3-cycle: [5 0|, 3, 3 (б. 2\\; two 4-cycles: Е i l2 | T: 0}, (2. 2) апа (б. 3i ls 3 (0 3 b 2 
[203626203 62026262 01 
СООО 696062 096004] mom 


3. (a) 3, 7, 10, 2, 12, 14, 11, 10, 6, 1, 7, 8, 0, 8, 8, 1, 9, 10, 4, 14, 3, 2, 5, 7, 12, 4, 1, 5, 6, 11, 2, 13, 0, 13, 13, 11, 9, 5, 14, 4, 3, 7, ...
(c) (5, 5), (10, 15), (4, 19), (2, 0), (2, 2), (4, 6), (10, 16), (5, 0), (5, 5), ...


(c) C 20А ЕЕЕ === === — „=“. „ж 
The first five iterates of [1s 0) are (zer. gor) (3er ter} Ger тег) (5 rx) Е ) 


(b) The matrices of Anosov automorphisms are (2 | алй E || 


(c) The transformation effects a rotation of S through 90° in the clockwise direction.


9 TNR (l, 1) (0.1) (112,1) (1.1) 


b] [ 1 а 
(0, Е ер 1/2) b] [ 2 b H5 


(0, LOS (1,0) (0.0) (1/2,0) (1,0) 


I юп: [2119 1.; iam [211 9, iam. |?) =| 711; onr: edi Бы! 
n region [Ble 0 ; 1n region [ele =j ; ш region [Ble =ï ; ш region [ele =) 


12. (1/5, 3/5) and (4/5, 2/5) form one 2-cycle, and (2/5, 1/5) and (3/5, 4/5) form another 2-cycle.


14. Begin with a 101 x 101 array of white pixels and add the letter 'A' in black pixels to it. Apply the mapping to this image, which will scatter the black pixels
throughout the image. Then superimpose the letter 'B' in black pixels onto this image. Apply the mapping again and then superimpose the letter 'C' in black pixels
onto the resulting image. Repeat this procedure with the letters 'D' and 'E'. The next application of the mapping will return you to the letter 'A' with the pixels for
the letters 'B' through 'E' scattered in the background.

Exercise Set 10.15 


1. a. GIYUOKEVBH


b. SFANEFZWJH 
2 a.a ,4 [12 7 
A =|; ‚| 


b. Not invertible 


c 442. | 1 19 
4 =|; | 


d. Not invertible 
e. Not invertible 


f ,-1_[15 12 
A = 
E | 


3. WE LOVE MATH 

^ Deciphering matrix — [ А; enciphering matrix = |; is] 
5. THEY SPLIT THE ATOM 

6. I HAVE COME TO BURY CAESAR


7. а. 010110001 


b. |0 1 1 
l1 1 
10-1 


8. A is invertible modulo 29 if and only if det(A) ≢ 0 (mod 29).
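
The invertibility tests in this exercise set can be sketched in code. The example assumes the standard criterion that a square matrix is invertible modulo m exactly when gcd(det(A), m) = 1; for a prime modulus such as 29 this reduces to det(A) ≢ 0. The function name and the sample matrices are illustrative and are not taken from the exercises.

```python
from math import gcd


def inverse_mod_2x2(A, m):
    """Return the inverse of the 2 x 2 matrix A modulo m, or None if none exists."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % m
    if gcd(det, m) != 1:
        return None                      # det has no reciprocal modulo m
    det_inv = pow(det, -1, m)            # modular reciprocal (Python 3.8+)
    return [[(det_inv * d) % m, (det_inv * -b) % m],
            [(det_inv * -c) % m, (det_inv * a) % m]]


print(inverse_mod_2x2([[5, 6], [2, 3]], 26))   # [[1, 24], [8, 19]]
print(inverse_mod_2x2([[1, 1], [1, 3]], 26))   # None, since det = 2 shares a factor with 26
print(inverse_mod_2x2([[1, 1], [1, 3]], 29) is not None)  # True, since 29 is prime
```

The second and third calls use the same matrix: it fails modulo 26 but succeeds modulo 29, which is the point of Exercise 8.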


Exercise Set 10.16 
2. 


n+l 
an=4+(5) (ap — co) а 1 
b.-1l uj oral 
п=5 п= n= ) as п 00 
n+l 1 
en=4-(3) (ag — co) бл 
4: 
tant = $ + уя Qao — bo — 40) 
1 _ п=0,1,2,... 
P254 37 604)" (2ag — bg — 4с0) 
€2241 = 0 
а2 -— + 1 (2ag — bg = 460) 
” 12 e" 
bal n=1,2 


TN МОНО ET E ИНЕ 
©2и= 15 HO (Zag = bg — 4с0) 


` Eigenvalues: Ay = 1,35 = h eigenvectors: еј = || ез = | | 


5. 12 generations; .006% 
6. 1 1 | 
Tta gurl 3-44 95)" «cse 5a- 49" 
1 1 n+l n+l 
3 ger lt» +95) I 1 
1 1 n = n 0 
3 auri +@-{5) ] о 
х0? = xe) _, аз # — оо 
1.1 чаж" «a-/5^ | : 
3 antl [A+ 0 
1 1 и+1 n+l 1 
3 ult +@-үзу) ] 2 
+1 +1 
Tts ger 3-4 + 95)" «cos 5a- 9^1 
8/1 0 0 0 
0000 
0000 
0001 


Exercise Set 10.17 


b. 9090). ы! а, Mens 951) 


50 50 88 125 191 
c. OOl] „® „у „®._|855 
к= ү када 287 
Te 2:375 
8. 1.49611 


Exercise Set 10.18 


Yield = 331% of population; ху = 


Yield = 45.8% of population; x = ; harvest 57.9% of youngest age class


= 
ole юе, = 
LLL LL La | ој OL 


2. 1.000 2.090 
845 ‚845 
824 824 
795 ‚795 
755 ‚755 
_ | .699 _ | .699 1.090 + .418 _ 
mm с, 5] e) pee T 
532 ‚532 
0 418 
0 0 
0 0 
0 0 
4, hp (R— 1) / (ауда s + i bp e + + + + aybibas c tby) 
5. p @1tagbi t+ +++ + (aysibiba* + 5j 2)-1 
ару s bp b c cc Фала c by 
Exercise Set 10.19 
1. я2 4 
3*4 cost + cos 2t + cos 3t 
2. T? Т? [o 28, L cos 49,4 1 cos EE p cos ER. 
3 + = cos pit cos pit ji cos pr cos н) 


Тү. 2®, lo 49.1. бт, lus т 
(«++ sin +1 sin би F sin Fr) 


1 _ 1 
5.7 UC a- 1)(2я + 1) 


1 10л? 1 aunt 
oF Ce SE os. 
T 10? T (2и)? Т | 


cos at) 


Exercise Set 10.20 


1. 2 2 


а. Yes; y= in + 52 + 5%3 


b. No; v = 2n 4 in - in 


©. Yes; v = 2n + in + буз 
6 5 


d. Yes; ү = ivi + 1572 + 15 


2. m = number of triangles = 7, n = number of vertex points = 7, k = number of boundary vertex points = 5; Equation (7) is 7 = 2(7) - 2 - 5.
3. w = Mv + b = M(c1 v1 + c2 v2 + c3 v3) + (c1 + c2 + c3)b
= c1(M v1 + b) + c2(M v2 + b) + c3(M v3 + b) = c1 w1 + c2 w2 + c3 w3
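
The identity in Exercise 3 can be checked numerically: an affine map w = Mv + b carries a convex combination of v1, v2, v3 to the same convex combination of their images. The sketch below uses a hypothetical matrix, translation, triangle, and set of coefficients chosen only for illustration; the coefficients must sum to 1.

```python
def affine(M, b, v):
    """Apply the affine map v -> Mv + b in the plane."""
    return [sum(M[i][j] * v[j] for j in range(2)) + b[i] for i in range(2)]


def combination(coeffs, points):
    """Return c1*p1 + c2*p2 + c3*p3 for points in the plane."""
    return [sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(2)]


# Hypothetical data, not taken from the exercise.
M = [[2.0, 1.0], [0.0, 3.0]]
b = [1.0, -1.0]
vertices = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
coeffs = [0.2, 0.3, 0.5]            # c1 + c2 + c3 = 1

v = combination(coeffs, vertices)
lhs = affine(M, b, v)                                             # Mv + b
rhs = combination(coeffs, [affine(M, b, p) for p in vertices])    # c1 w1 + c2 w2 + c3 w3
print(lhs, rhs)
```

Both printed vectors agree up to floating-point rounding, which mirrors the algebraic argument in the answer.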


4. MI у? 


hi 


v3 


Va 


эе 
ел 
ane 
с=ш= 


a. Two of the coefficients are zero. 
b. At least one of the coefficients is zero. 


c. None of the coefficients are zero. 


КҮЛ КИ! 
зї! + зҮ? + v3 


B 


= 